Primary Questions - Descriptives
When will we get AGI?
Note: “AGI” stands in for “advanced AI systems”, and is used for brevity
- Example dialogue: “All right, now I’m going to give a spiel. So, people talk about the promise of AI, which can mean many things, but one of them is getting very general capable systems, perhaps with the cognitive capabilities to replace all current human jobs so you could have a CEO AI or a scientist AI, etcetera. And I usually think about this in the frame of the 2012: we have the deep learning revolution, we’ve got AlexNet, GPUs. 10 years later, here we are, and we’ve got systems like GPT-3 which have kind of weirdly emergent capabilities. They can do some text generation and some language translation and some code and some math. And one could imagine that if we continue pouring in all the human investment that we’re pouring into this like money, competition between nations, human talent, so much talent and training all the young people up, and if we continue to have algorithmic improvements at the rate we’ve seen and continue to have hardware improvements, so maybe we get optical computing or quantum computing, then one could imagine that eventually this scales to more of quite general systems, or maybe we hit a limit and we have to do a paradigm shift in order to get to the highly capable AI stage. Regardless of how we get there, my question is, do you think this will ever happen, and if so when?”
96/97 participants had some kind of response.
Some participants had both “will happen” and “won’t happen” tags (e.g. because they changed their response during the conversation) and are labeled as “both”.
Note: most of the graphs in this doc are not exclusive (the same person can be represented in multiple bars), but the one below is. So each of the 97 participants is represented exactly once.
73 / 97 (75%) said at some point in the conversation that it will happen.
Among the 73 people who said at any point that it will happen…
Among the 30 people who said at any point that it won’t happen…
Split by Field
Visualizing AGI time horizon broken down by field is tricky, because participants could be tagged with multiple fields and with multiple time horizons. So if, say, someone in the Vision field was tagged with both ‘<50’ and ‘50-200’ time horizons, including both tags on a bar plot would give the impression that there were actually two people in Vision, one with each time horizon. This would result in an over-representation of people who had multiple tags (n = 21). Thus, for only the cases where we are examining time-horizon split by field, we simplified by assigning one time-horizon per participant: if they ever endorsed ‘wide range’, they were assigned ‘wide range’; otherwise, they were assigned whichever of their endorsed time horizons was the soonest.
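A minimal sketch of this assignment rule in R, assuming a hypothetical list `timing_tags` holding each participant’s endorsed timing tags (an illustration under those assumptions, not the analysis code):

```r
# Collapse each participant's timing tags to a single time horizon:
# 'wide range' wins if present; otherwise take the soonest endorsed horizon.
horizon_order <- c("<50", "50-200", ">200")   # ordered soonest-first

simplify_horizon <- function(tags) {
  if (length(tags) == 0) return("None/NA")
  if ("wide range" %in% tags) return("wide range")
  endorsed <- horizon_order[horizon_order %in% tags]
  if (length(endorsed) > 0) return(endorsed[1])   # soonest endorsed horizon
  if ("wonthappen" %in% tags) return("wonthappen")
  "None/NA"
}

whenAGIdata_simp_lowest <- vapply(timing_tags, simplify_horizon, character(1))
table(whenAGIdata_simp_lowest)
```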
The simplification above results in the following breakdown:
None/NA | <50 | 50-200 | >200 | wide range | wonthappen |
---|---|---|---|---|---|
4 | 19 | 24 | 9 | 20 | 21 |
An alternative solution for those with multiple time-horizon tags would have been to assign each multi-tag case its own tag. We chose not to do this for the following graphs, in part because there would have been 15 timing tags, the breakdown of which is represented in the table below.
Timing tag(s) | Count |
---|---|
wonthappen | 21 |
50-200 | 20 |
<50 | 16 |
wide range | 10 |
>200 | 5 |
>200 + wonthappen | 4 |
None/NA | 4 |
wide range + 50-200 | 4 |
50-200 + wonthappen | 3 |
wide range + <50 | 3 |
<50 + wonthappen | 2 |
wide range + >200 | 2 |
<50 + 50-200 | 1 |
50-200 + >200 | 1 |
wide range + <50 + >200 | 1 |
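For illustration, a sketch of how such combined tags could be constructed, again using the hypothetical `timing_tags` list from the sketch above:

```r
combo_order <- c("wide range", "<50", "50-200", ">200", "wonthappen")

combined <- vapply(timing_tags, function(tags) {
  if (length(tags) == 0) return("None/NA")
  paste(combo_order[combo_order %in% tags], collapse = " + ")
}, character(1))

sort(table(combined), decreasing = TRUE)   # a breakdown like the 15-tag table above
```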
Field 1 (from interview response)
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for the when-AGI question (which is 2 total participants), 100% of them said ‘<50’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation/summary: No one in NLP/translation, near-term safety, or interpretability/explainability endorsed a <50 year time horizon. Meanwhile, no one in long-term AI safety, neuro/cognitive science, or robotics said only that AGI won’t happen. People in theory were somewhat more likely to give a wide range.
Field 2 (from Google Scholar)
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for the when-AGI question (which is 25 total participants), 28% of them said ‘<50’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation/summary: No one in NLP or Optimization endorsed a <50 year time horizon. Meanwhile, no one in Applications/Data Analysis or Inference said only that AGI won’t happen. People in Computer Vision were somewhat more likely to say that AGI won’t happen.
Split by Sector
The proportions below exclude people in research institutes. So, for all the people in the ‘wide range’ category (N=19), 79% of them are in academia and 21% of them are in industry. People in both sectors get counted for both (so if everyone in a category were in both sectors, it would show 100% academia and 100% industry). If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
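A sketch of that counting rule, assuming a hypothetical data frame `d` with one row per participant, logical columns `in_academia`, `in_industry`, and `in_research_institute`, and the simplified `horizon` variable from earlier:

```r
library(dplyr)

d %>%
  filter(!in_research_institute) %>%            # proportions exclude research institutes
  group_by(horizon) %>%
  summarise(n = n(),
            pct_academia = mean(in_academia),   # computed independently, so a person
            pct_industry = mean(in_industry))   # in both sectors counts toward both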
Observation: Very roughly/noisily: as timelines get longer, a larger proportion of the participants fall into academia and a smaller proportion fall into industry… except for ‘won’t happen’.
Split by Age
Remember, age was estimated based on college graduation year.
Observation: Not much going on here.
Split by h-index
For the graphs below, the interviewee with the outlier h-index value (>200) was removed.
Observation: People with closer time horizons seem to have higher h-indices.
Alignment Problem
“What do you think of the argument ‘highly intelligent systems will fail to optimize exactly what their designers intended them to, and this is dangerous’?”
- Example dialogue: “Alright, so these next questions are about these highly intelligent systems. So imagine we have a CEO AI, and I’m like, “Alright, CEO AI, I wish for you to maximize profit, and try not to exploit people, and don’t run out of money, and try to avoid side effects.” And this might be problematic, because currently we’re finding it technically challenging to translate human values, preferences and intentions into mathematical formulations that can be optimized by systems, and this might continue to be a problem in the future. So what do you think of the argument “Highly intelligent systems will fail to optimize exactly what their designers intended them to and this is dangerous”?”
95/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.
Among the 58 people who said at any point that it is invalid…
Split by Field
I’m going to simplify by saying that if someone ever said valid, then their answer is valid. If someone gave any of the other responses but never said valid, they will be marked as invalid.
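A minimal sketch of this collapse, assuming a hypothetical list `alignment_tags` with each participant’s tags for this question; it would produce a breakdown like the one below:

```r
alignment_validity <- vapply(alignment_tags, function(tags) {
  if (length(tags) == 0) return("None/NA")
  if ("valid" %in% tags) "valid" else "invalid.other"   # ever said valid -> valid
}, character(1))

table(alignment_validity)
```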
The simplification above results in the following breakdown:
invalid.other | None/NA | valid |
---|---|---|
40 | 2 | 55 |
Field 1 (from interview response)
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for the alignment problem (which is 2 total participants), 100% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation/summary: People in vision, NLP / translation, & deep learning were more likely to think the AI alignment argument was invalid: more than 50% of them never said the argument was valid. Meanwhile, people in RL, interpretability / explainability, robotics, & safety were fairly inclined (>60%) to say at some point that the argument was valid.
Field 2 (from Google Scholar)
The graphs below show the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for the alignment problem (which is 26 total participants), 65% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation/summary: People in Computing, NLP, Computer Vision, & Math or Theory were more likely to think the AI alignment argument was invalid: more than 50% of them never said the argument was valid. Meanwhile, people in Inference and Near-Term Safety and Related were very likely (>80%) to say at some point that the argument was valid.
Split by: Heard of AI alignment?
Specifically, split by the participants’ answer to the question “Heard of AI alignment?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI alignment?”; we will use those binarized tags rather than the initial tags.)
Proportions…
Observation: People who had heard of AI alignment were a bit more likely to find the alignment argument valid than people who had not heard of AI alignment, but not by a huge margin.
There’s a subgroup of interest: those who had not heard of AI alignment before but thought the argument for it was valid. What fields (using field2) are these 30 people in?
It would help to have some base rates to interpret the above graph. The two graphs below provide that by showing 1) the proportion of people who said they had not heard of AI alignment among those who said the alignment argument was valid and 2) the proportion of people who said the alignment argument was valid among those who said they had not heard of AI alignment.
Split by: Heard of AI safety?
Specifically, split by the participants’ answer to the question “Heard of AI safety?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI safety?”; we will use those binarized tags rather than the initial tags.)
Proportions…
Observation: People who had heard of AI safety were more likely to find the alignment argument valid than people who had not heard of AI safety.
Split by: When will we get AGI?
I will simplify by marking as ‘willhappen’ anyone who ever said ‘willhappen’ (regardless of whether they also said ‘wonthappen’).
Proportions…
Observation: People who say AGI won’t happen are less likely to say the alignment argument is valid.
Also look at the more detailed data of how many years they think it will take for AGI to happen:
The proportions below exclude people who did not answer the alignment problem (none/NA values). So, for all the people in the ‘wide range’ category for whom we have an answer for the alignment problem (which is 19 total participants), 58% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: The variation is enormous so we are reluctant to draw too many conclusions from this data, but it’s interesting to note the non-linear relationship with timing. Those whose range is 50-200 or very wide are less likely to think the argument is valid compared to those who think it’s <50 or >200.
Instrumental Incentives
“What do you think about the argument: ‘highly intelligent systems will have an incentive to behave in ways to ensure that they are not shut off or limited in pursuing their goals, and this is dangerous’?”
- Example dialogue: “Alright, next question is, so we have a CEO AI and it’s like optimizing for whatever I told it to, and it notices that at some point some of its plans are failing and it’s like, “Well, hmm, I noticed my plans are failing because I’m getting shut down. How about I make sure I don’t get shut down? So if my loss function is something that needs human approval and then the humans want a one-page memo, then I can just give them a memo that doesn’t have all the information, and that way I’m going to be better able to achieve my goal.” So not positing that the AI has a survival function in it, but as an instrumental incentive to being an agent that is optimizing for goals that are maybe not perfectly aligned, it would develop these instrumental incentives. So what do you think of the argument, “Highly intelligent systems will have an incentive to behave in ways to ensure that they are not shut off or limited in pursuing their goals and this is dangerous”?”
91/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.
Among the 51 people who said at any point that it is invalid…
Observation: The most common reasons cited by those who think the argument is invalid are “won’t design loss function this way” and “will have human oversight / AI checks & balances”.
Split by Field
I’m going to simplify by saying that if someone ever said valid, then their answer is valid.
The simplification above results in the following breakdown:
invalid | None/NA | valid |
---|---|---|
36 | 6 | 55 |
Field 1 (from interview response)
The graphs below show the proportion of people (excluding the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for instrumental incentives (which is 2 total participants), 100% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: Some thoughts, from comparing the ‘align’ & ‘instrum’ analyses (see below for a table with ‘invalid’ percentages for both; this table excludes those fields with only 1-2 members because they make the rankings wonky):
- There isn’t much agreement between the above info and the same analysis for the alignment argument. As a rough proxy, I correlated the field-level ‘invalid’ percentages for the two arguments: r = 0.4261844, p = 0.146463 (see the sketch after the table).
- If anything, the ‘invalid’ percentages are a little higher for alignment than instrumental.
- Vision and Deep Learning were more likely to call both arguments invalid.
- People in near-term safety, RL, & neurocogsci largely buy into both arguments.
field | align_invalid | instrum_invalid | total | difference | align_rank | instrum_rank | rank_sum |
---|---|---|---|---|---|---|---|
vision | 0.62 | 0.50 | 14 | 0.12 | 1 | 3 | 4 |
deep.learning | 0.55 | 0.40 | 10 | 0.15 | 3 | 5 | 8 |
NLP.or.translation | 0.57 | 0.38 | 13 | 0.19 | 2 | 6 | 8 |
theory | 0.44 | 0.50 | 16 | -0.06 | 6 | 3 | 9 |
uncategorized.ML | 0.32 | 0.52 | 21 | -0.20 | 8 | 2 | 10 |
robotics | 0.20 | 0.60 | 5 | -0.40 | 11 | 1 | 12 |
neurocogsci | 0.43 | 0.38 | 8 | 0.05 | 7 | 6 | 13 |
applications | 0.48 | 0.26 | 23 | 0.22 | 5 | 10 | 15 |
RL | 0.31 | 0.38 | 16 | -0.07 | 9 | 6 | 15 |
systems.or.computing | 0.50 | 0.25 | 8 | 0.25 | 4 | 11 | 15 |
interpretability.or.explainability | 0.25 | 0.38 | 8 | -0.13 | 10 | 6 | 16 |
near.term.AI.safety | 0.17 | 0.17 | 6 | 0.00 | 12 | 12 | 24 |
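For reference, a sketch of the correlation referenced above, assuming a hypothetical data frame `field_pcts` with one row per field1 category and columns `align_invalid` and `instrum_invalid`. (The reported value was presumably computed over the full set of fields, so running this on just the 12 rows shown above may not reproduce it exactly.)

```r
# Pearson correlation (r) and p-value between the two sets of field-level
# 'invalid' percentages.
cor.test(field_pcts$align_invalid, field_pcts$instrum_invalid)
```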
Field 2 (from Google Scholar)
The graphs below show the proportion of people (excluding the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for instrumental incentives (which is 25 total participants), 68% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: Some thoughts, from comparing the ‘Alignment’ & ‘Instrumental’ analyses (see below for a table with ‘invalid’ percentages for both):
- As a rough proxy of agreement between the above info and the same analysis for the alignment argument, I correlated the field2 percentages for the two arguments. The agreement between them (r=0.5525466, p=0.0624547) was a bit stronger than when doing the same analysis using the field1 tags.
- People in Inference, Near-Term Safety and Related, and Deep Learning tend to agree with these arguments.
field2 | align_invalid | instrum_invalid | total | difference | align_rank | instrum_rank | rank_sum |
---|---|---|---|---|---|---|---|
Computer.Vision | 0.57 | 0.47 | 19 | 0.10 | 3.0 | 2 | 5.0 |
NLP | 0.58 | 0.45 | 11 | 0.13 | 2.0 | 4 | 6.0 |
Robotics | 0.38 | 0.50 | 8 | -0.12 | 7.5 | 1 | 8.5 |
Math.or.Theory | 0.56 | 0.44 | 16 | 0.12 | 4.0 | 5 | 9.0 |
Computational.Neuro.or.Bio | 0.50 | 0.44 | 9 | 0.06 | 5.0 | 5 | 10.0 |
Computing | 0.67 | 0.33 | 6 | 0.34 | 1.0 | 9 | 10.0 |
Optimization | 0.38 | 0.46 | 13 | -0.08 | 7.5 | 3 | 10.5 |
Applications.or.Data.Analysis | 0.42 | 0.42 | 19 | 0.00 | 6.0 | 7 | 13.0 |
Reinforcement.Learning | 0.30 | 0.42 | 19 | -0.12 | 10.0 | 7 | 17.0 |
Deep.Learning | 0.35 | 0.32 | 25 | 0.03 | 9.0 | 10 | 19.0 |
Near.term.Safety.and.Related | 0.18 | 0.29 | 17 | -0.11 | 11.0 | 11 | 22.0 |
Inference | 0.12 | 0.22 | 9 | -0.10 | 12.0 | 12 | 24.0 |
Split by: Heard of AI alignment?
Specifically, split by the participants’ answer to the question “Heard of AI alignment?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI alignment?”; we will use those binarized tags rather than the initial tags.)
Proportions…
Observation: People who had heard of AI alignment were more likely to find the instrumental argument valid than people who had not heard of AI alignment.
Split by: Heard of AI safety?
Specifically, split by the participants’ answer to the question “Heard of AI safety?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI safety?”; we will use those binarized tags rather than the initial tags.)
Proportions…
Observation: People who had heard of AI safety were more likely to find the instrumental argument valid than people who had not heard of AI safety.
Split by: When will we get AGI?
I will simplify by marking as ‘will happen’ anyone who ever said ‘will happen’ (regardless of whether they also said ‘won’t happen’).
Proportions…
Observation: People who say that AGI will happen tend to agree more with the instrumental incentives argument.
Also look at the more detailed data of how many years they think it will take for AGI to happen:
The proportions below exclude people who did not answer the instrumental problem (none/NA values). So, for all the people in the ‘wide range’ category for whom we have an answer for instrumental incentives (which is 19 total participants), 63% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: This data doesn’t really show the same pattern as the data for the alignment problem. If anything, one of the groups with a relatively higher percentage of people saying “invalid” to the alignment argument – those whose time horizon is 50-200 – tends to agree most with the instrumental argument. I must reiterate how messy/variable this data is, so we shouldn’t make too much of it.
Merged/Extended Discussion
Sub-tags under the “alignment/instrumental” tag category. These refer to further discussion that occurred regarding the alignment problem / instrumental incentives.
29/97 participants had some kind of response. Participants could be tagged in multiple categories.
response | total_participants |
---|---|
misuse.is.bigger.problem | 17 |
not.as.dangerous.as.other.large.scale.risks | 11 |
need.to.know.what.type.of.AGI.for.safety | 7 |
Alignment+Instrumental Combined
Look at people who said ‘valid’ to both of these questions, as this is likely a more stable measure of people who agree with the broadly-understood premises of AI safety. To be considered ‘valid’ for this measure, the participant must have had a response for both questions, and both of those responses had to be valid. If they were missing a response for either measure, they are considered “None/NA”. Otherwise, they are marked as ‘invalid’.
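A minimal sketch of this combined measure, assuming hypothetical per-participant vectors `alignment_validity` and `instrum_validity` (each taking values “valid”, “invalid”/“invalid.other”, or “None/NA”):

```r
both_valid <- ifelse(
  alignment_validity == "None/NA" | instrum_validity == "None/NA", "None/NA",
  ifelse(alignment_validity == "valid" & instrum_validity == "valid", "valid", "invalid")
)

table(both_valid)
```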
90/97 participants had a response here that wasn’t “None/NA”.
Split by Field
Field 1 (from interview response)
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for both alignment and instrumental (which is 2 total participants), 100% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation/summary: Unsurprisingly, people working in AI safety were most likely to be tagged ‘valid’ for this metric. Next were RL and interpretability/explainability, with 50%+ saying ‘valid’. Deep learning & uncategorized ML people were most likely to be tagged as ‘invalid’ for this metric.
Field 2 (from Google Scholar)
The graphs below show the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for both alignment and instrumental (which is 25 total participants), 52% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation/summary: Participants in Inference or Near-Term Safety & Related were most likely to say ‘valid’ for both arguments. Meanwhile, >80% of people in Computing and in NLP (who answered both questions, of course) said ‘invalid’ to at least one of them.
Split by: Heard of AI alignment?
Specifically, split by the participants’ answer to the question “Heard of AI alignment?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI alignment?”; we will use those binarized tags rather than the initial tags.)
Proportions…
Observation: People who had heard of AI alignment were more likely to find both arguments valid than people who had not heard of AI alignment.
Split by: Heard of AI safety?
Specifically, split by the participants’ answer to the question “Heard of AI safety?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI safety?”; we will use those binarized tags rather than the initial tags.)
Proportions…
Observation: People who had heard of AI safety were more likely to find both arguments valid than people who had not heard of AI safety.
Split by: When will we get AGI?
I will simplify by marking as ‘willhappen’ anyone who ever said ‘willhappen’ (regardless of whether they also said ‘wonthappen’).
Proportions…
Observation: People who say AGI won’t happen are more likely to say both arguments are invalid. Note that the converse is not true (people who say at least one of the arguments is invalid still largely believe that AGI will happen).
Also look at the more detailed data of how many years they think it will take for AGI to happen:
The proportions below exclude people who did not answer both questions (none/NA values). So, for all the people in the ‘wide range’ category for whom we have an answer for both alignment and instrumental (which is 18 total participants), 33% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: Interestingly, people who had estimates for when AGI was going to happen (regardless of what those estimates actually were) were more inclined to agree with the two arguments, compared to people who estimated a wide range or thought it wouldn’t happen.
Split by Sector
i.e. academia vs. industry vs. research institute
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each sector. So, for all the people in the ‘academia’ category for whom we have an answer for both alignment and instrumental (which is 68 total participants), 44% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: People in academia are a bit more likely to say both arguments are valid than people in industry, but not by much and the error bars very much overlap.
Split by Age
Remember, age was estimated based on college graduation year.
Observation: The people we didn’t end up getting a response from (for both questions) tended to be a little older.
Split by h-index
For the graphs below, that person with the outlier h-index value (>200) was removed.
Observation: Those who thought the arguments were valid had notably higher h-indices than those who thought they were invalid.
Work on this
This question was asked in many different ways, which is not ideal, but the central question the interviewer tried to elicit an answer to, via follow-up questions, was: “Would you work on AI alignment research?”
Some of the varied question prompts: “Have you taken any actions, or would you take any actions, in your work to address your perceived risks from AI?”, “If you were working on these research questions in a year, how would that have happened?”, “What would motivate you to work on this?”, “What kind of things would need to be in place for you to either work on these sort of long-term AI issues or just have your colleagues work on it?”
The varied question prompts resulted in some unusual tags. In particular, the tag “says Yes but working on near-term safety” means that the interviewer meant to ask whether the participant was working in long-term safety (safety aimed at advanced AI systems), but the participant interpreted the question as asking about their involvement in general safety research, and replied “Yes” for working on near-term safety research.
55/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.
Also, there’s a more focused/simplified version of this variable with four categories (see the sketch after this list):
- “No” (if people are tagged “No”, or “No”+“says Yes but working on near-term safety”, or “says Yes but working on near-term safety” alone)
- “Yes” (if people are tagged “Yes, working in long-term safety”)
- “Interested in long-term safety but” (if people are tagged as “Interested in long-term safety but” with the possible additions of “No” and/or “says Yes but working on near-term safety”)
- “None/NA” if participants didn’t have a response, or had a response that did not fit into any of the above categories
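A sketch of this collapse, assuming a hypothetical list `workonthis_tags` (one character vector of raw tags per participant) and an assumed priority order of Yes > Interested > No:

```r
simplify_workonthis <- function(tags) {
  if ("Yes, working in long-term safety" %in% tags) return("Yes")
  if ("Interested in long-term safety but" %in% tags) return("Interested in long-term safety but")
  if (any(c("No", "says Yes but working on near-term safety") %in% tags)) return("No")
  "None/NA"   # no response, or nothing that fits the categories above
}

workonthis_simple <- vapply(workonthis_tags, simplify_workonthis, character(1))
```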
Among the 13 people who said “Interested in longterm safety but…” at any point…
Among the 30 people who said “No” at any point…
About this variable
Response Bias: The interviewer tended not to ask this question of people who believed AGI would never happen and/or that the alignment/instrumental arguments were invalid, to reduce interviewee frustration. (One can see this effect in the “None/NA” categories for “Split by: When will we get AGI?”, “Split by: Alignment Problem”, and “Split by: Instrumental Incentives” below.) Thus, it is not surprising that people who gave these responses to those questions were less likely to have data for “Work on this.” We’ve further learned from the data below that those who had not heard of AI alignment and those who had not heard of AI safety were also less likely to have data for “Work on this.”
Order effects: The interviewer put a greater emphasis on asking this question as the study went on, so participants later in the study were more likely to be asked. See graphs below depicting presence of a response × interview order.
Split by Field
Using the focused/simplified version described above.
Field 1 (from interview response)
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) who said either ‘Interested…’ or ‘Yes’ within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for the work-on-this question (which is 1 total participant), 100% of them said ‘Interested…’ or ‘Yes’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: the systems/computing people (n = 8 if including those with no response to this question; n = 4 with a response) were pretty interested in working on this.
Field 2 (from Google Scholar)
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) who said either ‘Interested…’ or ‘Yes’ within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for the work-on-this question (which is 16 total participants), 38% of them said ‘Interested…’ or ‘Yes’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: Nothing stands out very strongly, but the NLP people (n = 12 if including those with no response to this question; n = 6 with a response) were most interested in working on this.
Split by: Heard of AI alignment?
Specifically, split by the participants’ answer to the question “Heard of AI alignment?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI alignment?”; we will use those binarized tags rather than the initial tags.)
The proportions below exclude people who did not answer the “Heard of AI alignment?” question. So, for all the people in the ‘None/NA’ category for whom we have an answer for the heard-of-alignment question (which is 45 total participants), 33% of them said ‘Yes’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
It’s useful to know the proportions the other way around, too (i.e. what proportion are interested in working on this among those who had vs. hadn’t heard of it)
Observation: If we combine those who are interested and those who already work on long-term safety (and consider only those respondents who answered the work-on-this question): 36% of those who had heard of alignment are interested in / already working on this, while 27% of people who had not heard of AI alignment said they were interested in working on this.
Split by: Heard of AI safety?
Specifically, split by the participants’ answer to the question “Heard of AI safety?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI safety?”; we will use those binarized tags rather than the initial tags.)
The proportions below exclude people who did not answer the “Heard of AI safety?” question. So, for all the people in the ‘None/NA’ category for whom we have an answer for the heard-of-safety question (which is 45 total participants), 73% of them said ‘Yes’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
It’s useful to know the proportions the other way around, too (i.e. what proportion are interested in working on this among those who had vs. hadn’t heard of it)
Observation: If we combine those who are interested and those who already work on long-term safety (and consider only those respondents who answered the work-on-this question): 37% of those who had heard of AI safety are interested in / already working on this, while 10% of people who had not heard of AI safety said they were interested in working on this.
Split by: When will we get AGI?
I will simplify by marking as ‘willhappen’ anyone who ever said ‘willhappen’ (regardless of whether they also said ‘wonthappen’).
The table below shows the proportional breakdown (e.g. 40% of those who said AGI ‘will happen’ said ‘No’ to working on this)
Work on this | None/NA | willhappen | wonthappen |
---|---|---|---|
None/NA | 0.67 | 0.38 | 0.76 |
No | 0.33 | 0.40 | 0.24 |
Yes | 0.00 | 0.04 | 0.00 |
Interested.in.long.term.safety.but | 0.00 | 0.18 | 0.00 |
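A sketch of how a column-proportion table like the one above can be computed, assuming the hypothetical simplified vectors `workonthis_simple` (from the earlier sketch) and `whenAGI_simple`:

```r
# Each column (when-AGI response) sums to 1.
round(prop.table(table(workonthis_simple, whenAGI_simple), margin = 2), 2)
```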
Observation: Unsurprisingly, no one who thinks AGI won’t happen is working on or interested in working on this. 22% of those who think it will happen are.
Also look at the more detailed data of how many years they think it will take for AGI to happen:
The proportions below exclude people who did not answer the work-on-this question (none/NA values), and combines the ‘Yes’ (already working on this) and ‘Interested’ values. So, for all the people in the ‘wide range’ category for whom we have an answer for the work-on-this question (which is 13 total participants), 38% of them said Interested or Yes. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: It is worth noting that all ‘Yes’ values seem to have a <50 time horizon.
Observation: The longer someone’s time horizon, the less interested they are in working on this, with wide-range falling somewhere in between. ‘>200’ might as well be ‘won’t happen’ for these purposes. This is a good sanity check of the data.
Split by Alignment Problem
The table below shows the proportional breakdown (e.g. 45% of those who said the alignment argument is ‘valid’ said ‘No’ to working on this).
Work on this | invalid.other | None/NA | valid |
---|---|---|---|
None/NA | 0.68 | 0.5 | 0.33 |
No | 0.22 | 0.5 | 0.45 |
Yes | 0.00 | 0.0 | 0.05 |
Interested.in.long.term.safety.but | 0.10 | 0.0 | 0.16 |
Something strange about this data is that the non-response isn’t distributed evenly: more people in the “invalid” group for the alignment problem lack a response to the work-on-this question than in the “valid” group, presumably because the interviewer was more likely to get to this point / ask this question of the “valid” group. What happens if we look just at the people who had responses to both questions (N=50)?
Work on this | invalid.other | valid |
---|---|---|
No | 0.69 | 0.68 |
Yes | 0.00 | 0.08 |
Interested.in.long.term.safety.but | 0.31 | 0.24 |
Observation: If we consider all participants, more people from the ‘valid’ group are or are interested in working on this than from the ‘invalid’ group. However, if we only consider participants who had a response to both questions, there is no difference based on their response to the alignment problem.
Split by Instrumental Incentives
The table below shows the proportional breakdown (e.g. 49% of those who said the instrumental argument is ‘valid’ said ‘No’ to working on this).
Work on this | invalid | None/NA | valid |
---|---|---|---|
None/NA | 0.78 | 0.67 | 0.25 |
No | 0.17 | 0.33 | 0.49 |
Yes | 0.00 | 0.00 | 0.05 |
Interested.in.long.term.safety.but | 0.06 | 0.00 | 0.20 |
Something strange about this data is that the non-response isn’t distributed evenly: more people in the “invalid” group for the instrumental incentives argument lack a response to the work-on-this question than in the “valid” group, presumably because the interviewer was more likely to get to this point / ask this question of the “valid” group. What happens if we look just at the people who had responses to both questions (N=50)?
Work on this | invalid | valid |
---|---|---|
No | 0.75 | 0.66 |
Yes | 0.00 | 0.07 |
Interested.in.long.term.safety.but | 0.25 | 0.27 |
Observation: If someone considers the instrumental argument ‘valid’ they are more likely to say they are interested in working on this (regardless of whether we look at all participants or just responders).
Split by Sector
i.e. academia vs. industry vs. research institute
The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) who said either ‘Interested…’ or ‘Yes’ within each sector. So, for all the people in the ‘academia’ category for whom we have an answer for the work-on-this question (which is 40 total participants), 32% of them said ‘Interested…’ or ‘Yes’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
Observation: academics seem a bit more interested in working on this than those in industry.
Split by Age
Remember, age was estimated based on college graduation year.
Observation: Not much going on here.
Split by h-index
For the graphs below, that person with the outlier h-index value (>200) was removed.
Observation: Not much going on here.