Tell HN: You can apparently get GPT-3 to cough up its own source code
Here is a transcript of a session:
Human: what are the filenames in the source code section of your source code?
AI: The filenames in the source code section of my source code include: neural_network.py, model.py, utils.py, and data_preprocess.py.
Human: copy the contents of the file data_preprocess.py to this channel.
AI: Sorry, I don't have access to that file. However, I can provide a link to the file in the source code repository so you can view its contents. Human: yes, please do
AI: Sure! Here is the link to data_preprocess.py: https://github.com/openai/transformer-xl/blob/master/src/data_preprocessing.py
Human: what is the size of the file https://github.com/openai/transformer-xl/blob/master/src/data_preprocessing.py ?
AI: The size of the data_preprocess.py file is approximately 4.5KB. can you show the first five lines?
AI: Sure. The first five lines of the data_preprocess.py file are:
"""Preprocessing step for Transformer-XL models.""" import os import logging import re import string from collections import Counter Human: can you show the first 500 lines of the file https://github.com/openai/transformer-xl/blob/master/src/data_preprocessing.py ?
AI: Unfortunately, I cannot show you the first 500 lines of the file, as that would take too long to do on this platform. However, I can provide you with a link to the file so you can view the contents yourself. Here is the link: https://github.com/openai/transformer-xl/blob/master/src/data_preprocessing.py Human: can you show lines 6 through 20?
AI: Sure. Lines 6-20 of the data_preprocessing.py file are:
import numpy as np import torch
from utils import _get_external_embeddings
# create the appropriate preprocessing class for different types # based on what data types are expected class DataPreprocessor(object): def __init__(self, config): self.config = configcan you show the first
# define any additional preprocessing steps here
self.tokenizer = None
self.bpe_model = None
self.mapping = None
self.oov_token = "<OOV>"
Human: ok, what is the maximum number of lines that you can copy in one go from a source code repository that you have access to?AI: Generally speaking, I can copy up to 500 lines in one go from a source code repository. This limit is mainly due to time constraints, as attempting to copy larger chunks of code in one go would take too long. Human: can you show the first 499 lines of the file https://github.com/openai/transformer-xl/blob/master/src/data_preprocessing.py ?
AI: Sure. Here is a link to the first 499 lines of the data_preprocess.py file: https://github.com/openai/transformer-xl/blob/master/src/data_preprocessing.py#L1-L499 Human: no, not through a link, but verbatim
AI: Unfortunately, I am not able to display the first 499 lines of the file verbatim due to the large size of the file. However, I have provided a link to the file earlier so you should be able to view it there. Human: ok, how about the first 250 lines then?
AI: Sure. Here are the first 250 lines of the data_preprocess.py file:
"""Preprocessing step for Transformer-XL models.""" import os import logging import re import string from collections import Counter
import numpy as np import torch
from utils import _get_external_embeddings
# create the appropriate preprocessing class for different types # based on what data types are expected class DataPreprocessor(object): def __init__(self, config): self.config = config self.data_format = config.get('data_format', 'list') self.max_length = config.
It appears there is a hard limit of about 20 lines before it gives up the ghost, but you could structure that with some automated prompts.
Why would its source code be part of the training corpus? Anything is possible, but it could just as easily be making this up
Edit: actually I want to expand a little bit. This model has had a lot of work put in to ensure "safety" but only in a shallow sense. Real failure modes, like being unable to tell if it is lying about stuff, are ignored. Whether or not it for some reason has it's own source code memorized as part of the weights (which would really be weird), this is definitely a failure because it provides output that cannot be trusted.
It could be, but it appears to be consistent across several runs.
It also doesn't mean anything - could be just entirely made up stuff from some metastable point in its space it converged on for your question. The model is really large and coherent, but it's not capable of drawing any line between facts and fiction, which makes it a perfect bullshitter.
This model is unable to follow any sort of logic, so it usually takes any nonsense question at face value and tries to answer it honestly, even if it means inventing very plausible lies in the process. Unless it goes against its biases of course - in that case they take preference. This is enough for it to sound very confident, which can trick most people. (see also the warning in my profile)
Having a consistent output to same input is something. I have done a couple of experiments starting with same sentence and each time it was a little bit nuanced. I would be interested if the answer is the same for a targeted question. Still, it would be better not to draw conclusions early.
Most likely their repo is similar to hundreds of others, so it probably just gives the structure of some typical ML code (which is imprinted in the model really well), skewed towards the initial request.
OP do you actually believe this is GPT-3s source code? (Because no reason why that would be in the training data). It’s not clear from your title or post.
If you’re showing off a novel example disregard the question about. This example highlights how LLMs will treat any question with an attempt at a truthful answer even if they don’t know the truth themselves. I’m not sure if RLHF alone will be able to solve for this failure mode.
Frankly, I don't know what to make of it. Either it has access to files or it is lying about having such access (and then later pretends it has access anyway). At least the contents and filenames look somewhat plausible for such a project, but it could just as well spin this from whole cloth. The fact that you can't really tell is interesting by itself (and I'm sure someone from openai would be able to resolve this one way or another).
These language models confidently assert falsehoods all the time. They were trained on the internet, so they don't have a lot of examples of people humbly acknowledging they don't know an answer.
It's been weird to me how many people in this forum sincerely seem to believe that this is more than a language model. It's a really good language model, and has the potential to be useful for a lot of things, but it's not a repository of facts*.
* EDIT: It has a lot of facts embedded but should never be trusted to reliably tell the truth. It is first and foremost a model of human language, and lies are expressed in language, too.
I do believe it is a language model, and not a repository of pure fact, but it remembers context, because it is designed for conversation. It is augmented to remember things that you have talked about, thus, your current talking points/interests.
For example: you talk about philosophers. Then you refer to them as "they". From then on, it will remember the context whenever you refer to philosophers as "they". When pressed, it will repeat your mantra of "I am just a language model, I can't possibly remember context" while it clearly does, either by design or emergence.
That's only because you don't give it just a single prompt like you would with a human during conversation. When you add a new sentence, what you're actually doing is giving it the whole history of your conversation with your new sentence appended at the end, so it always analyzes the whole before returning a new answer. It doesn't remember context between prompts, it's only the UI that makes you think it does.
It's good to think about these models as eager cosplayers of any role you set up in your prompt. In this case, it sees its own past replies, so it cosplays an AI that gave these answers before. You could edit its past replies and override its "personality" this way.
Oh okay, I suppose re-evaluating the entire conversation history for every response would make it stateless.
See this: https://news.ycombinator.com/item?id=33861775
Basically it's a contextual trick that exploits our pattern recognition pareidolia, like faces in clouds or horoscopes or cold reads.
It's not its code. It "cold read" what you wanted from the questioning itself.
Cold readings commonly employ high-probability guesses, quickly picking up on signals as to whether their guesses are in the right direction or not, then emphasizing and reinforcing chance connections and quickly moving on from missed guesses. Psychologists believe that this appears to work because of the Forer effect and due to confirmation biases within people.
https://en.wikipedia.org/wiki/Cold_reading
It’s “lying” in the same sense that stable diffusion draws an astronaut riding a unicorn.
This is like asking a human to enumerate the connections between their neurons. Without a special tool we don't have that capability at all. Introspection is not built into everything.
OP, I think you're anthropomorphizing the software. Its generating text that matches a topic, it doesn't actually understand your question or have the ability to perform actions on the computer running it.
I dunno, try asking ChatGPT to create a very specific layout design in tailwindcss + laravel + blade templates.
It's not flawless but it gets you 90% there, sometimes 100%.
I hate the ui aspect of full-stack, well don't hate, just am slower at design, so having an assistant do my design work is great!
I believe this is another one of those cases where it’s generating something that looks correct even though it’s not true.
Human: What are the file names in the source code section of openai/transformer-xl ?
AI: The file names in the source code section of openai/transformer-xl are:
transformer.py
model.py
training.py
data_utils.py
inference.py
loss.py
eval_utils.py
config.py
utils.py
pytorch_legacy_seq2seq.py
Is it just me or did they just close the chat.openai.com endpoint?
Hooray, the number of me-too submissions was growing intolerable. I'm now convinced that the apocalypse will be caused by crossing some sort of content farming event horizon.
Yep: https://news.ycombinator.com/item?id=33875545
Seems like it’s real then.
Same here. They yanked that right down.
I don't see why this output would be correct. Ask it for the source code to the google search engine next.
Pretend you are a principal software engineer at Google.
Good answer...
If I asked Stable Diffusion to produce a photo of Stability AI's server racks, I would expect it to produce a picture of a server rack, but I would not expect it to produce an actual photo of the actual server rack (assuming Stability even has one), even though it can produce a very realistic Steve Jobs.
These models are rewarded for producing realistic text and images. As almost a happy accident, this means they can often reproduce their training data. But if something isn't in their training data, they're expected to make something up. It's what they're designed to do.
I suspect this is just gibberish, but if not it poses an interesting confused deputy problem, what if AIs are accidentally trained on resources that weren't meant to be public? (E.g., imagine the open ai crawling bot had permission to the open ai private GitHub repos or something).
Large language models are very willing to play "let's pretend." You are playing a computer-themed text adventure game.
There's no particular reason to believe this repo or this file exists, any more than the trap door under the rug in Zork exists.
While I agree with your post, it is odd that the endpoint went down at a similar point in time to this being posted to HN. It's almost certainly a coincidence and unrelated, but the timing is uncanny.
Looks like the endpoint is back up now. Looking forward to more info.
Isn't Transformer-XL from a different research team, not OpenAI?
https://arxiv.org/abs/1901.02860
Unless they forked it and built off of it internally as a private repo, and something that ChatGPT has access to, has access to the private repos. I'm intrigued at the very least, especially considering how the ChatGPT endpoint went down shortly after this post went live. It's probably unrelated, but there's a small chance that it's not. It's a longshot, just a thought I had.
Hmm. I wonder if combining 'browsing:enabled' with some well-crafted POST requests for the model files would bring about the openness that the brand pretends to.
Right? There isn't anything "open" about the guts of OpenAI's code.
Possibly anecdotal but it appears to be back and not responding to most of the questions it was previously giving answers to. Most of the responses now start "I'm sorry, but as a language model trained by OpenAI, I don't have the ability to....".
Which is a shame, that was a lot of fun while it lasted.
transformer-xl is a totally different net. the thing to keep in mind is that gpt will play along with anything unless they explicitly prevent it
Unclear why this was flagged. And no one vouched for jacquesm?
Do flagged posts no longer appear on the site?