Discussion
Flipboard Tech Desk
@TechDesk@flipboard.social  ·  last week

How easy is it to "poison" a large language model's data? Much easier than experts previously thought. New research from the Alan Turing Institute indicates that inserting as few as 250 documents is enough to manipulate a model's behavior. Here's more from the institute's blog, including a link to the original paper.

https://flip.it/Dz8pC3

#Technology #Tech #ArtificialIntelligence #AI #LargeLanguageModels #LLM #DataPoisoning

noplasticshower
@noplasticshower@infosec.exchange replied  ·  last week
@TechDesk Sorry, but this is so underspecified as to be completely meaningless.
Flipboard Tech Desk
@TechDesk@flipboard.social replied  ·  last week

Hi @noplasticshower, here's a link to the paper:

https://arxiv.org/pdf/2510.07192

noplasticshower
@noplasticshower@infosec.exchange replied  ·  last week
@TechDesk Thank you. We will add this to our BIML reading list. Do note that, in our experience, Anthropic is terrible at science. Not sure about the others.

(See, for example, https://berryvilleiml.com/2024/02/08/absolute-nonsense-from-anthropic-sleeper-agents/)

Cybarbie
@nf3xn@mastodon.social replied  ·  last week
@TechDesk It's called 'Answer Engine Optimisation'