Time to go to the talkies!


https://talkie-lm.com/chat - to chat with "Talkie"
https://talkie-lm.com/introducing-talkie - Original blog post

Just for a bit of fun, do you want to see what an LLM created in the 1930s would say? Check out Talkie in the link above.

All modern LMs are trained on essentially the same enormous set of data: text scraped from the web plus digitised versions of older paper publications. While there are visible differences between the major models, they all behave very similarly and can all draw on the same body of data, either during training or at inference time through access to the web.

Limiting the dataset to pre-1930 text gives a fascinating look, at least in theory, at what might be different in our collective culture compared to back then. You can also investigate (as the creators of the model have): does it anticipate inventions it never saw? Can it learn to code from examples despite having no knowledge of computers? How much of what we think we know about LMs is actually just "things web-trained models do"?

I tried asking a question that is topical at the moment, given the re-emergence of One Nation as a political force (at least if you believe the polls). What surprised me was that running the same prompt multiple times produced responses that were similar in some ways and very different in others. Almost every time, the response mentioned some variation of:

the importance of maintaining a standard of living for the current population
having numbers that maintain "healthful growth", whatever that means
some mention of considering the differing needs of states/towns based on the main industries
a focus on settling mainly in country areas

What differed was the way it would oscillate between relatively progressive views ("The national origin of the immigrants should be left to individual choice") and some much less so ("The Latin nations should be kept out. China and India should be absolutely prohibited."). See the screenshots below for two examples illustrating this.

I tried a few other types of prompts and got a similar mix of responses. It does make me wonder: is it really just a model based on pre-1930 text, or is there some reinforcement learning going on to steer it away from the most controversial statements it might make? The original blog post does note: