Victor Ashioya's Shortform

The new open source model from Google

MiguelDev (8mo)

https://huggingface.co/google/gemma-7b

The 7B and 2B open models in Google's Gemma family, built from the same research and technology as Gemini.

Victor Ashioya (6mo)

I watched Sundar's interview segment on CNBC, where he was asked about Sora using YouTube data, but his answer was evasive and vague. He just said, "we have laws on copyright..."

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models (not peer-reviewed as of this writing)

From the abstract:

Based on the framework, we design JailbreakLens, a visual analysis system that enables users to explore the jailbreak performance against the target model, conduct multi-level analysis of prompt characteristics, and refine prompt instances to verify findings. Through a case study, technical evaluations, and expert interviews, we demonstrate our system's effectiveness in helping users evaluate model security and identify model weaknesses.

TransformerLens, a library that lets you load an open-source model and exposes its internal activations, instantly comes to mind. I wonder if Neel's work inspired at least the name.

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

TL;DR: a comparison of DPO (reward-free) and PPO (reward-based) for RLHF, focusing on why PPO performs poorly on academic benchmarks.
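For context on what "reward-free" means here, this is a minimal sketch of the standard per-pair DPO loss, written in plain Python. It is not code from the paper: the function name, the argument names, and the default `beta=0.1` are assumptions made for illustration, and the inputs are assumed to be summed sequence log-probabilities.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    All four inputs are assumed to be summed log-probabilities of a full
    response under the policy or the frozen reference model.
    """
    # How much more the policy favours each response than the reference does.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_shift - rejected_shift)
    # Numerically stable -log(sigmoid(x)), i.e. softplus(-x).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

No separate reward model appears anywhere: the preference signal comes directly from the log-probability margins, whereas PPO would first fit a reward model and then optimise against it.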

An excerpt from Section 5, "Key Factors to PPO for RLHF":

We find three key techniques: (1) advantage normalization (Raffin et al., 2021), (2) large-batch-size training (Yu et al., 2022), and (3) updating the parameters of the reference model with exponential moving average (Ouyang et al., 2022).

The ablation studies find large-batch-size training to be significantly beneficial, especially on code generation tasks.
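Two of the three techniques in the excerpt are simple enough to sketch in a few lines. The following is a rough illustration in plain Python, not the paper's implementation: the function names, the `eps` term, the use of the population standard deviation, and the `decay=0.99` default are all assumptions, and parameters are represented as flat lists of floats for simplicity.

```python
import math


def normalize_advantages(advantages, eps=1e-8):
    """Shift and scale a batch of advantages to zero mean and unit variance."""
    n = len(advantages)
    mean = sum(advantages) / n
    # Population variance over the batch (an assumption; some codebases use n - 1).
    var = sum((a - mean) ** 2 for a in advantages) / n
    std = math.sqrt(var)
    return [(a - mean) / (std + eps) for a in advantages]


def ema_update(ref_params, policy_params, decay=0.99):
    """Move reference-model parameters toward the policy with an exponential moving average."""
    return [decay * r + (1.0 - decay) * p
            for r, p in zip(ref_params, policy_params)]
```

With a decay close to 1, the reference model trails the policy slowly, so the KL penalty is measured against a slowly moving target rather than the fixed initial model.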

Ann (7mo)

Might be worth following up to see how ORPO compares. (Initial results suggest it's basically a better DPO.)

Another interesting detail is that PPO still shows superior performance on RLHF testbeds.

I have just finished reading Nathan Lambert's analysis of DBRX, and it seems the DBRX demo has safety filtering in the loop, which was confirmed by one of the finetuning leads at Databricks. It is going to be interesting to try jailbreaking it.

Here is an excerpt:

I just learnt of a newsletter, "AI News", which collects all the news about AI into one email. It can get long, since it gathers everything from Twitter, Reddit and Discord. Overall, it is a great source of news. I sometimes find it hard to read everything, but by skimming the table of contents I can discover something interesting and go straight to it. For instance, here is the newsletter (too long, so I clipped it) for 23rd March 2024:

cubefox (8mo)

A shorter, higher-level alternative is Axis of Ordinary, which is also available via Facebook and Telegram.

Cool! Will check it out!

Just stumbled across the paper "Are Emergent Abilities of Large Language Models a Mirage?" and it is quite interesting. Can't believe I only came across it today. At a time when everyone is quick to note "emergent capabilities" in LLMs, it is great to have another perspective.

Easily my favourite paper since "Exploiting Novel GPT-4 APIs"!!!

Remember, they are not "hallucinations", they are confabulations produced by dream machines i.e. the LLMs!

The UK AI Safety Institute: should it work? This is how standard AI regulation organizations should operate: build no models of their own, just test the current ones and report. They should not act as gatekeepers per se and deter research right from the start. I am of the view that not every nation needs to build its own AI.

The introduction of the LPU (https://wow.groq.com/GroqDocs/TechDoc_Latency.pdf) completely changes the conversation around scaling laws, pivoting us to matters like latency.

gwern (9mo)

No it doesn't, not unless Groq wants to discuss publicly what the cost of that hardware was and it turns out to be, to everyone's shock, well under $5m... (And you shouldn't trust any periodical which wastes half an article on the topic of what Groq & Grok have to do with each other. There are many places you can get AI news, you don't have to read Coin Telegraph.)

Mmmh ok, I guess let us keep an eye out.

The UKAISI (UK AI Safety Institute) and the US AI Safety Institute have just signed an agreement to "formally co-operate on how to test and assess risks from emerging AI models."

Red teaming not only internally but also with third-party external partners, a mix of domain experts, is the way to go. On that one, OAI really made a great move.

Victor Ashioya (7mo)

I found it interesting that both share the same name (not sure about the abbreviation) and now have this first-of-its-kind bilateral agreement. Another interesting thing is that one side (Rishi Sunak's) is optimistic while the Biden side is doomer-ish.

To quote the FT article, the partnership is modeled on the one between GCHQ and NSA.

Victor Ashioya (8mo)

Happy Pi Day, everyone. Remember that math (statistics, probability, calculus, etc.) is a key foundation of AI and should not be trivialised.

Victor Ashioya (8mo)

If Elon is suing OAI on the grounds that it is not open source, then it is hypocritical, since Grok is not open source either; just maybe he has other motives...