Posts by Tags

Is your multi agent systems really multi agentic?

6 minute read

In this post, we will explore the concept of multi agent LLM systems, distinguishing between true multi agent systems and what are essentially modular single-agent systems. Read more

Physics in Video Language Models

6 minute read

In this post, we explore the physics behind text to video language models and how they can be used to generate realistic videos from text prompts. Read more

Tokenization: The Red Pill to see past The Matrix

7 minute read

Tokenization is a necessary and often overlooked component in large language models. In this post, we explore the importance of tokenization and how it might very well be the key to unlocking the advanced abilities of what we expect AI to be in the future. Read more

Open-ended vs Descriptive Prompting: Finding the right balance in your AI use-cases

6 minute read

When working with language models, the way we phrase our prompts can dramatically affect the quality and usefulness of the responses we receive. In this post we explore the dichotomy in setting up prompts - being open-ended vs descriptive - and understand the implication of each. Read more

Part 2 - The Hidden Geometry of Large Language Models: A New Perspective on Reasoning

2 minute read

At Tenyx, we’ve delved into the intricate workings of Large Language Models (LLMs) to uncover the geometric structures underlying their reasoning capabilities. Our research provides new insights into how LLMs process information and the implications for improving their reasoning abilities. Read more

Part 1 - The Hidden Geometry of Large Language Models: Implications on Safety & Toxicity‍

3 minute read

At Tenyx, we’ve spent countless hours peering into the intricate workings of Large Language Models (LLMs). Today, we’re excited to share our research, in collaboration with Brown University, that sheds light on the geometric structures and transformations governing these models. Our work provides new insights into how LLMs process their inputs and the implications for AI safety in applications driven by LLMs. Read more

Forgetting and Toxicity in LLMs: A Deep Dive on Fine-Tuning Methods

6 minute read

Fine-tuning is a common procedure by which a pretrained language model is updated with training on a domain-specific dataset to improve performance in that domain (i.e. a chatbot to answer enterprise-specific Q&A, a hotel booking agent). It has been known for some time (if not widely appreciated) that fine-tuning a model on new data degrades its performance on the initial pretraining dataset (the dreaded “catastrophic forgetting” problem in ML). But by how much? And do all fine-tuning methods degrade performance in the same ways, and to the same extent? Read more

Understanding MLPs as Hashing Functions: A Geometric Perspective

4 minute read

When we think about Multi-Layer Perceptrons (MLPs), we often visualize them as interconnected neurons processing information. However, there’s an elegant alternative perspective - viewing MLPs as hashing functions that partition input space and mapping functions on these partitions. Read more

Is your multi agent systems really multi agentic?

6 minute read

In this post, we will explore the concept of multi agent LLM systems, distinguishing between true multi agent systems and what are essentially modular single-agent systems. Read more

Understanding MLPs as Hashing Functions: A Geometric Perspective

4 minute read

When we think about Multi-Layer Perceptrons (MLPs), we often visualize them as interconnected neurons processing information. However, there’s an elegant alternative perspective - viewing MLPs as hashing functions that partition input space and mapping functions on these partitions. Read more

Revisiting k-nearest neighbor benchmarks in self-supervised learning

5 minute read

Standard protocols for benchmarking self-supervised models involve using a linear or k-nearest neighbor classification on frozen features of the learned model. However, both evaluations are sensitive to hyperparameters making the evaluation and comparison complicated. Read more

Difficulties in training a Generative Adversarial Network

7 minute read

Generative modeling is a branch of machine learning that attempts at modeling the probability distribution of high dimensional data, for example - images Read more

A notion of uncertainty in modern neural networks

1 minute read

In this post, we will look at a particular view of uncertainty in modern deep learning systems using droput Read more

A preliminary on generative modeling using neural networks

5 minute read

Generative modeling is a branch of machine learning that attempts at modeling the probability distribution of high dimensional data, for example - images Read more

Neural network compression using optimal brain damage

5 minute read

Model pruning in neural networks was the answer I ended up with when I got wondering about the workings of dropout, dropconnect and papers like Do Deep Nets Really Need to be Deep? Read more

A brief introduction to activation functions in neural networks

5 minute read

Neural networks can be viewed as layers of building blocks neurons made up of weights, biases and activation function. A fundamental understanding of how these basic units function could help one in achieving one’s objective Read more

Efficient graph construction to represent images

5 minute read

Image processing over the years has evolved from simple linear averaging filters to highly adaptive non linear filtering operations such as the bilateral filter (BF), moving least squares, BM3D and LARK to name a few. Read more

Representing data using graphs: A sparse signal approximation view

7 minute read

Graph driven machine learning has seen a surge of interest in the past few years with several applications in social sciences, biology, and network analysis, to name a few. However, in some scenarios, no graph is given a priori and one one has to infer and construct a graph to fit the data given. Read more

Efficient graph construction to represent images

5 minute read

Image processing over the years has evolved from simple linear averaging filters to highly adaptive non linear filtering operations such as the bilateral filter (BF), moving least squares, BM3D and LARK to name a few. Read more

Is your multi agent systems really multi agentic?

6 minute read

In this post, we will explore the concept of multi agent LLM systems, distinguishing between true multi agent systems and what are essentially modular single-agent systems. Read more

Revisiting k-nearest neighbor benchmarks in self-supervised learning

5 minute read

Standard protocols for benchmarking self-supervised models involve using a linear or k-nearest neighbor classification on frozen features of the learned model. However, both evaluations are sensitive to hyperparameters making the evaluation and comparison complicated. Read more

Difficulties in training a Generative Adversarial Network

7 minute read

Generative modeling is a branch of machine learning that attempts at modeling the probability distribution of high dimensional data, for example - images Read more

A notion of uncertainty in modern neural networks

1 minute read

In this post, we will look at a particular view of uncertainty in modern deep learning systems using droput Read more

A preliminary on generative modeling using neural networks

5 minute read

Generative modeling is a branch of machine learning that attempts at modeling the probability distribution of high dimensional data, for example - images Read more

Neural network compression using optimal brain damage

5 minute read

Model pruning in neural networks was the answer I ended up with when I got wondering about the workings of dropout, dropconnect and papers like Do Deep Nets Really Need to be Deep? Read more

A brief introduction to activation functions in neural networks

5 minute read

Neural networks can be viewed as layers of building blocks neurons made up of weights, biases and activation function. A fundamental understanding of how these basic units function could help one in achieving one’s objective Read more

Open-ended vs Descriptive Prompting: Finding the right balance in your AI use-cases

6 minute read

When working with language models, the way we phrase our prompts can dramatically affect the quality and usefulness of the responses we receive. In this post we explore the dichotomy in setting up prompts - being open-ended vs descriptive - and understand the implication of each. Read more

Part 2 - The Hidden Geometry of Large Language Models: A New Perspective on Reasoning

2 minute read

At Tenyx, we’ve delved into the intricate workings of Large Language Models (LLMs) to uncover the geometric structures underlying their reasoning capabilities. Our research provides new insights into how LLMs process information and the implications for improving their reasoning abilities. Read more

Representing data using graphs: A sparse signal approximation view

7 minute read

Graph driven machine learning has seen a surge of interest in the past few years with several applications in social sciences, biology, and network analysis, to name a few. However, in some scenarios, no graph is given a priori and one one has to infer and construct a graph to fit the data given. Read more

Part 1 - The Hidden Geometry of Large Language Models: Implications on Safety & Toxicity‍

3 minute read

At Tenyx, we’ve spent countless hours peering into the intricate workings of Large Language Models (LLMs). Today, we’re excited to share our research, in collaboration with Brown University, that sheds light on the geometric structures and transformations governing these models. Our work provides new insights into how LLMs process their inputs and the implications for AI safety in applications driven by LLMs. Read more

Forgetting and Toxicity in LLMs: A Deep Dive on Fine-Tuning Methods

6 minute read

Fine-tuning is a common procedure by which a pretrained language model is updated with training on a domain-specific dataset to improve performance in that domain (i.e. a chatbot to answer enterprise-specific Q&A, a hotel booking agent). It has been known for some time (if not widely appreciated) that fine-tuning a model on new data degrades its performance on the initial pretraining dataset (the dreaded “catastrophic forgetting” problem in ML). But by how much? And do all fine-tuning methods degrade performance in the same ways, and to the same extent? Read more

Tokenization: The Red Pill to see past The Matrix

7 minute read

Tokenization is a necessary and often overlooked component in large language models. In this post, we explore the importance of tokenization and how it might very well be the key to unlocking the advanced abilities of what we expect AI to be in the future. Read more

Physics in Video Language Models

6 minute read

In this post, we explore the physics behind text to video language models and how they can be used to generate realistic videos from text prompts. Read more

Sarath Shekkizhar, Ph.D.

Posts by Tags

LLM

MLP

agents

deep learning

graph construction

image processing

multi agent

neighborhood methods

neural networks

prompting

reasoning

representation learning

safety

tokenization

video