On-Device AI: Deploying Quantized LLMs on the Edge | Kisaco Research
Speaker(s): 

Author:

Shreya Singhal

Generative AI Research & Development
Aristocrat Gaming

Shreya Singhal works in Generative AI Research & Development at Aristocrat Gaming and is a Graduate Research Assistant at the University of Texas at Austin. Her work spans large language models (LLMs), reinforcement learning, and optimization for scalable and interpretable AI systems.

Shreya has hands-on experience fine-tuning open-source models such as Gemma 2B for under-resourced languages, deploying compressed generative models in low-resource environments, and implementing bias and fairness evaluation pipelines using interpretable subspace analysis. She has previously worked at Dell Technologies, Charles Schwab, Deloitte, and Accenture, contributing to AI-powered solutions across gaming, finance, and enterprise automation.

Her current research focuses on efficient LLM training and evaluation pipelines, fairness-aware model design, and bringing generative AI to edge and enterprise use cases. She is passionate about making AI more inclusive, scalable, and grounded in real-world constraints.

Time: 
5:25 PM - 5:45 PM
Agenda Track No.: 
Track 3
Session Type: 
Track
Session Stage: