OpenAI has a new coherent text generator called GPT-2 that, according to backer Elon Musk, “is so good it is scary.”
Don’t worry, OpenAI knows we are not ready for the avalanche of fake news it could generate – Fakebookers do that well enough now. But it is worried enough to warn that bad actors (and we don’t mean Kevin Spacey) could create and place “deceptive, biased, or abusive language at scale.”
In brief, it is called GPT-2 (blog here), and you only have to give it a few facts and let it do the rest. We are not sure where its creativity comes from, but the writing style is pretty good. GPT-2 has 1.5 billion parameters and was trained on a dataset of 8 million web pages. Eventually, it will read them all!
GPT-2 displays a broad set of capabilities, including generating conditional synthetic text samples of unprecedented quality: given an input prompt, it produces a lengthy, coherent continuation.
Simple Example of GPT-2
System Prompt (human-written): Miley Cyrus was caught shoplifting from Abercrombie and Fitch on Hollywood Boulevard today.
GPT-2: Shoplifting: Miley Cyrus was caught shoplifting from Abercrombie and Fitch on Hollywood Boulevard today (pictured)
The singer was wearing a black hoodie with the label ‘Blurred Lines’ on the front and ‘Fashion Police’ on the back
The singer was also wearing a pair of black-rimmed glasses, a black jacket, black jeans and black sandals.
She was carrying a pair of black and white striped gloves and a small black bag.
There is a great short piece on recycling here, where a six-word prompt produces 319 words presented as a fairly logical argument.
GPT-2 can summarise and curate news items too
Because it has been trained on a huge slice of the web, you can give it basic facts and have it write curated news briefs covering every aspect.
The GPT-2 blog continues:
Large, general language models could have significant societal impacts, and also have many near-term applications. We can anticipate how systems like GPT-2 could create:
- AI writing assistants
- More capable dialogue agents
- Unsupervised translation between languages
- Better speech recognition systems
We can also imagine the application of these models for malicious purposes, including the following (or other applications we can’t yet anticipate):
- Generate misleading news articles
- Impersonate others online
- Automate the production of abusive or faked content to post on social media
- Automate the production of spam/phishing content
In the wrong hands, ‘deep fakes’ – convincing fake content and disinformation campaigns – will mean the public needs to become more sceptical of what it finds online.
GadgetGuy’s take: GPT-2 could be great but …
Trusted sources like GadgetGuy will become a valuable reference. But we need to both earn and maintain trust over time.