
ChinAI #315: Abandoned? Checking in on Three Key AI Safety Benchmarks

Published 4 hours ago · 3 minute read

Greetings from a world where…

Department Q is good TV

…As always, the searchable archive of all past issues is available. Please please subscribe to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content, but those who can pay support access for all AND compensation for awesome ChinAI contributors).

Have we done that thing where we lead off with a Chinese idiom? Not yet? Well, now we have to do it. How about this one: everything has a beginning but not always an ending [靡不有初,鲜克有终]. It’s a reminder to follow through and finish what you start.

For this week’s issue, I checked in on the latest updates to three key AI safety benchmarks developed by Chinese organizations.

*Note: Two of these benchmark developers, CAICT and the Shanghai AI Lab, have been identified as among the “.”

Unfortunately, based on my review of developments in these three AI safety benchmarks since their launch, follow-through is lacking. Let’s run through these in order.

I don’t want to overstate the case here. “Abandoned” is probably too harsh. For one, it may be that I missed some things in my scan. For instance, in December 2024 SuperCLUE published a separate AI security benchmark, joint work with the Third Research Institute of the Ministry of Public Security. While it is not linked to the safety benchmark, it does include prompts that gauge resilience to jailbreak attacks. Second, in some cases, model developers can run their own evaluations on these benchmarks without centralized coordination from the host organization. Third, Chinese firms may end up relying on international AI safety benchmarks instead of these efforts led by Chinese organizations.

So, at the end of the day, perhaps this post is really just a reminder to myself to follow through: to check in months and years after the initial hullabaloo.

With research assistance from Ruby Qiu, Kendra Schaefer’s Trivium post analyzes the 3,739 generative AI tools that are listed in China’s algorithm registry. Her preliminary analysis is very insightful; more importantly, she helps contextualize why this is just a valuable starting point for other research into China’s AI ecosystem:

“The fact that this data set exists is pretty incredible. Imagine having access to a definitive list of all public-facing generative algorithms operating in the US. But due to China’s rather heavy-handed governance of the online environment, we have this very robust tool we can use to assess the state of China’s AI ecosystem.”

Concordia’s AI safety newsletter continues to be a useful roundup of developments in China. I went back to this April 2024 issue to get background info on CAICT’s AI Safety Benchmark.

These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.

Check out the archive of all past issues & please subscribe to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content, but those who can pay for a subscription will support access for all).

Also! Listen to narrations of the ChinAI Newsletter in podcast format.

