blovescoffee a day ago
More details on the actual architecture would be nice. It seems like this autoregressive action world model space is a local maximum until the JEPA work takes over.
gunalx 18 hours ago
I don't see why we couldn't hack on JEPA instead of any other input encoder on a transformer LLM.