Research and Innovation
Top 40 CVPR posters (11 minute read) A curated list of computer vision papers from the 2024 CVPR conference. |
Toucan TTS in 7000 languages (GitHub Repo) The Toucan project recently released new text-to-speech models that have been extended to cover every language in the ISO 639-3 standard. |
Engineering & Resources
1300 tokens per second on Mac (7 minute read) Implementing a batch-parallel KV cache in MLX leads to a dramatic inference-time speed-up for synthetic data generation and model completions. |
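For context, the speed-up comes from decoding many sequences in one forward pass by carrying a batch dimension through the key/value cache. Below is a minimal, assumption-laden sketch of that idea using mlx.core; the `BatchKVCache` and `attention_step` helpers are illustrative stand-ins, not the post's actual implementation.

```python
# Illustrative sketch of a batched KV cache: B prompts share one decode loop by
# carrying a batch dimension through the cached keys/values. Simplified: a real
# implementation would preallocate and update in place rather than concatenate.
import mlx.core as mx

class BatchKVCache:
    def __init__(self):
        self.keys = None    # (batch, n_heads, cached_len, head_dim)
        self.values = None

    def update(self, k, v):
        # k, v: (batch, n_heads, new_tokens, head_dim) for the current step
        if self.keys is None:
            self.keys, self.values = k, v
        else:
            self.keys = mx.concatenate([self.keys, k], axis=2)
            self.values = mx.concatenate([self.values, v], axis=2)
        return self.keys, self.values

def attention_step(q, k_new, v_new, cache):
    # q: (batch, n_heads, 1, head_dim) -- one new token per sequence in the batch
    k, v = cache.update(k_new, v_new)
    scores = (q @ k.transpose(0, 1, 3, 2)) * (q.shape[-1] ** -0.5)
    probs = mx.softmax(scores, axis=-1)
    return probs @ v   # (batch, n_heads, 1, head_dim), computed for all sequences at once

# Example: decode one step for a batch of 4 sequences with 8 heads of width 64.
cache = BatchKVCache()
q = k = v = mx.random.normal((4, 8, 1, 64))
out = attention_step(q, k, v, cache)
```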
Mixture of Attention in LLMs (GitHub Repo) The Mixture of Attention (MoA) approach optimizes sparse attention in large language models by tailoring a distinct sparsity configuration to each head and layer. |
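The core idea of head-specific sparsity can be illustrated with sliding-window masks of different widths per head. The numpy sketch below mirrors that concept only; the window sizes and the `head_specific_masks`/`sparse_attention` helpers are made-up stand-ins, not MoA's actual search procedure or kernels.

```python
# Per-head sliding-window attention masks of different widths, mimicking the idea
# of head-specific sparse configurations (conceptual only, not MoA's code).
import numpy as np

def head_specific_masks(seq_len, windows):
    """windows[h] = local window size for head h; returns (n_heads, seq_len, seq_len) bool masks."""
    masks = np.zeros((len(windows), seq_len, seq_len), dtype=bool)
    for h, w in enumerate(windows):
        for i in range(seq_len):
            lo = max(0, i - w + 1)          # each query attends to its last w tokens (causal)
            masks[h, i, lo:i + 1] = True
    return masks

def sparse_attention(q, k, v, masks):
    # q, k, v: (n_heads, seq_len, head_dim); masks: (n_heads, seq_len, seq_len)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores = np.where(masks, scores, -np.inf)   # drop positions outside each head's window
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs @ v

# Example: three heads with progressively wider local windows.
q = k = v = np.random.randn(3, 8, 4)
out = sparse_attention(q, k, v, head_specific_masks(8, windows=[2, 4, 8]))
```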
Train vision models in TRL (GitHub Repo) TRL is a Hugging Face library for training transformer models with reinforcement learning. This example shows how to apply the same training process to vision-language models such as LLaVA. |
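In outline, the example pairs a vision-language model and its processor from transformers with TRL's SFTTrainer and a collator that encodes images and text together. The sketch below approximates that setup; the checkpoint, dataset, column names, and collator are assumptions and may differ from the repo's actual script, and argument names vary across TRL versions.

```python
# Hedged sketch of supervised fine-tuning a VLM with TRL; the dataset schema and
# several argument values are assumptions, not the repo's exact example.
import torch
from datasets import load_dataset
from transformers import AutoProcessor, LlavaForConditionalGeneration
from trl import SFTConfig, SFTTrainer

model_id = "llava-hf/llava-1.5-7b-hf"   # assumed checkpoint for illustration
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def collate(examples):
    # Assumes each example carries an "image" and a flattened "text" field.
    texts = [ex["text"] for ex in examples]
    images = [ex["image"] for ex in examples]
    batch = processor(text=texts, images=images, return_tensors="pt", padding=True)
    labels = batch["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100   # ignore padding in the loss
    batch["labels"] = labels
    return batch

train_ds = load_dataset("HuggingFaceH4/llava-instruct-mix-vsft", split="train")  # assumed dataset

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="llava-sft",
                   per_device_train_batch_size=1,
                   gradient_accumulation_steps=8,
                   remove_unused_columns=False,
                   dataset_kwargs={"skip_prepare_dataset": True}),
    train_dataset=train_ds,
    data_collator=collate,
)
trainer.train()
```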