VIDEO | Nathan Labenz sits down with Ronen Eldan and Yuanzhi Li of Microsoft Research to discuss TinyStories, the small synthetic dataset of simple stories they created.
EURACTIV - Just as the timeline of the world’s first AI treaty begins to align with the EU legislative agenda, an American-led push to exclude private companies could make it not worth the paper it is written on.
Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi and Ahmed Awadallah introduce Orca, a 13-billion-parameter model that learns to imitate the reasoning process of large foundation models (LFMs). Orca learns from rich signals from GPT-4, including explanation traces, step-by-step thought processes, and other complex instructions.
VIDEO | Yannic Kilcher takes a look at RWKV, a highly scalable architecture that combines Transformer-style parallel training with RNN-style efficient inference.
Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein and Dilip Krishnan (Google Research) present StyleDrop, which enables the generation of images that faithfully follow a specific style, powered by Muse, a text-to-image generative vision transformer.
Stella Biderman shares resources where she documents key features of LLMs.
Paul S. Scotti, Atmadeep Banerjee, Jimmie Goode, Stepan Shabalin, Alex Nguyen, Ethan Cohen, Aidan J. Dempster, Nathalie Verlinde, Elad Yundler, David Weisberg, Kenneth A. Norman and Tanishq Mathew Abraham present MindEye, a novel fMRI-to-image approach to retrieve and reconstruct viewed images from brain activity.
Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot and Ross Anderson find that using model-generated content in training causes irreversible defects in the resulting models, where the tails of the original content distribution disappear, an effect they call model collapse.
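A toy way to see why the tails vanish: if each generation of a model is fit to samples drawn from the previous generation, estimation error compounds and the learned distribution contracts. The Gaussian resampling loop below is a minimal sketch of that feedback loop under invented parameters, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal(200)  # generation 0: "human" data, N(0, 1)

# Each generation fits a Gaussian to the previous generation's samples,
# then generates its own training data for the next generation.
for gen in range(1, 201):
    mu, sigma = data.mean(), data.std()
    data = rng.normal(mu, sigma, size=200)
    if gen % 50 == 0:
        tail = (np.abs(data) > 2).mean()
        print(f"gen {gen:3d}: sigma={sigma:.3f}  P(|x|>2)={tail:.3f}")
# sigma tends to shrink generation over generation, so the
# distribution's tails (rare events) progressively disappear.
```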
Zichang Liu, Aditya Desai, Fangshuo Liao, Weitao Wang, Victor Xie, Zhaozhuo Xu, Anastasios Kyrillidis and Anshumali Shrivastava propose Scissorhands, a system that maintains the memory usage of the KV cache at a fixed budget without finetuning the model.
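The idea leans on the observation that tokens important to attention tend to stay important, so a cache over budget can evict the tokens that have attracted the least attention so far. The snippet below is only a sketch of such budget-capped eviction; the shapes, the always-kept recent window, and the function names are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def evict_to_budget(keys, values, attn_mass, budget, recent=32):
    """Sketch: cap the KV cache at `budget` entries by evicting old
    tokens with the least cumulative attention. `attn_mass[i]` is the
    attention mass token i has received so far; the `recent` newest
    tokens are never evicted. Shapes: keys/values are (seq_len, d)."""
    seq_len = keys.shape[0]
    if seq_len <= budget:
        return keys, values, attn_mass
    old = seq_len - recent
    # Rank older tokens by historical attention; keep only the top ones.
    keep_old = np.sort(np.argsort(attn_mass[:old])[-(budget - recent):])
    keep = np.concatenate([keep_old, np.arange(old, seq_len)])
    return keys[keep], values[keep], attn_mass[keep]

k = v = np.random.randn(100, 64)
mass = np.random.rand(100)
k2, v2, m2 = evict_to_budget(k, v, mass, budget=48)
print(k2.shape)  # (48, 64): cache held to the fixed budget
```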
Yao Yao, Zuchao Li and Hai Zhao propose Graph-of-Thought (GoT) reasoning, which models human thought processes not only as a chain but also as a graph. By representing thought units as nodes and connections between them as edges, the approach captures the non-sequential nature of human thinking and allows for a more realistic modeling of thought processes.
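As a purely illustrative sketch of that representation (the example problem and node labels are invented, not from the paper): thought units become nodes in a DAG, and a later thought can draw on several earlier thoughts at once rather than only its immediate predecessor.

```python
from collections import defaultdict

# Hypothetical thought units for a word problem; edges mark which
# thoughts feed into which conclusions (a DAG rather than a chain).
thoughts = {
    "t1": "Alice has 3 apples.",
    "t2": "Bob has twice as many apples as Alice.",
    "t3": "Bob has 6 apples.",             # derived from t1 and t2
    "t4": "Together they have 9 apples.",  # derived from t1 and t3
}
edges = [("t1", "t3"), ("t2", "t3"), ("t1", "t4"), ("t3", "t4")]

parents = defaultdict(list)
for src, dst in edges:
    parents[dst].append(src)

def context_for(node):
    """Gather every premise reachable upstream of `node`, mirroring how
    a graph lets one thought aggregate several earlier thoughts."""
    seen, stack = [], list(parents[node])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.append(p)
            stack.extend(parents[p])
    return [thoughts[p] for p in sorted(seen)]

print(context_for("t4"))  # t1, t2 and t3 all feed the final answer
```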
Daniel Miessler offers a first attempt at a framework for thinking about how to attack AI systems.
Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan and Anima Anandkumar introduce VOYAGER, the first LLM-powered embodied lifelong learning agent in Minecraft, which continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention.
Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine and Dawn Song critically analyze the emerging practice of cheaply improving a weaker language model by finetuning it on outputs from a stronger model, such as a proprietary system like ChatGPT (e.g., Alpaca, Self-Instruct, and others).
About six-in-ten U.S. adults (58%) are familiar with ChatGPT, though relatively few have tried it themselves, according to a Pew Research Center survey conducted in March. Among those who have tried ChatGPT, a majority report it has been at least somewhat useful.
Nature - Chia-Yen Chen, Ruoyu Tian, Tian Ge, Max Lam, Gabriela Sanchez-Andrade, Tarjinder Singh, Lea Urpa, Jimmy Z. Liu, Mark Sanderson, Christine Rowley, Holly Ironfield, Terry Fang, Biogen Biobank Team, The SUPER-Finland study, The Northern Finland Intellectual Disability study, Mark Daly, Aarno Palotie, Ellen A. Tsai, Hail...
Tim Dettmers, Artidoro Pagnoni, Ari Holtzman and Luke Zettlemoyer present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance.
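In practice the recipe is to load the frozen base model with 4-bit NF4 quantization (with double quantization and bf16 compute) and train 16-bit low-rank adapters on top. A minimal sketch using the Hugging Face transformers and peft libraries follows; the model name and LoRA hyperparameters are illustrative, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization with double quantization, computing in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters are trained in 16-bit on top of the frozen 4-bit base.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapters are trainable
```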
Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng and Yang You empirically investigate three key aspects of repeating the pre-training data for additional epochs.
Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón and Sumit Sanghai introduce grouped-query attention (GQA), a generalization of multi-query attention that uses an intermediate number of key-value heads (more than one, but fewer than the number of query heads).
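A toy sketch of the mechanism, with shapes invented for illustration: each key-value head is shared by a group of query heads, so one KV head recovers multi-query attention and as many KV heads as query heads recovers standard multi-head attention.

```python
import torch

def grouped_query_attention(q, k, v, num_kv_heads):
    """Sketch of GQA: q has H query heads, k/v have G < H heads; each
    group of H // G query heads shares one key-value head.
    Shapes: q is (B, H, T, D); k and v are (B, G, T, D)."""
    B, H, T, D = q.shape
    G = num_kv_heads
    assert H % G == 0
    # Broadcast each KV head to its group of query heads.
    k = k.repeat_interleave(H // G, dim=1)
    v = v.repeat_interleave(H // G, dim=1)
    scores = q @ k.transpose(-2, -1) / D ** 0.5
    causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = torch.randn(1, 8, 4, 16)      # 8 query heads
k = v = torch.randn(1, 2, 4, 16)  # 2 shared key-value heads
out = grouped_query_attention(q, k, v, num_kv_heads=2)  # (1, 8, 4, 16)
```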
Yoshua Bengio, best known for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award, “the Nobel Prize of Computing,” together with Geoffrey Hinton and Yann LeCun, analyzes AI risk.
Hritvik Taneja, Jason Kim, Jie Jeff Xu, Stephan van Schaik, Daniel Genkin and Yuval Yarom investigate the susceptibility of Arm SoCs and GPUs to information leakage via power, temperature and frequency readings from internal sensors.
Xingang Pan, Ayush Tewari, Thomas Leimkühler, Lingjie Liu, Abhimitra Meka and Christian Theobalt propose DragGAN, which consists of two main components: 1) a feature-based motion supervision that drives the handle point to move towards the target position, and 2) a new point tracking approach that leverages the discriminative generator features to keep localizing the position of the handle points.
VIDEO | Aza Raskin, co-founder of Earth Species Project, shares how the latest advances in AI help us better understand and learn from other species.
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
Zijiao Chen, Jiaxin Qing and Juan Helen Zhou propose Mind-Video, which learns spatiotemporal information from continuous fMRI data of the cerebral cortex progressively through masked brain modeling, multimodal contrastive learning with spatiotemporal attention, and co-training with an augmented Stable Diffusion model that incorporates network temporal inflation.
Joseph Thacker (rez0) breaks down and explains the best self-contained proof of concept for how indirect prompt injection can lead to plugin hijacking with severe consequences.