Feynman Innovations

ajibawa-2023

AjinkyaBawase

AI & ML interests

LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.

Recent Activity

reacted to DmitryRyumin's post with 🔥 6 days ago

🚀🤖🌟 New Research Alert - ICCV 2025 (Oral)! 🌟🤖🚀
📄 Title: Variance-based Pruning for Accelerating and Compressing Trained Networks 🔝
📝 Description: The one-shot pruning method efficiently compresses networks, reducing computation and memory usage while retaining almost full performance and requiring minimal fine-tuning.
👥 Authors: Uranik Berisha, Jens Mehnert, and Alexandru Paul Condurache
📅 Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA 🇺🇸
📄 Paper: https://huggingface.co/papers/2507.12988
🚀 ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers
🚀 Added to the Efficient Learning Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/efficient-learning.md
📚 More Papers: more cutting-edge research presented at other conferences in the https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin
🔍 Keywords: #VarianceBasedPruning #NetworkCompression #ModelAcceleration #EfficientDeepLearning #VisionTransformers #AI #ICCV2025 #ResearchHighlight

reacted to onekq's post with 👍 6 days ago

Context rot is such a catchy phrase, but the problem was identified 2+ years ago under the name attention decay. https://huggingface.co/papers/2307.03172 I spotted the same problem in coding tasks and documented it in my book (https://www.amazon.com/dp/9999331130). Why did this problem become hot again? Because many of us thought the problem had been solved by long-context models, which is not true. Here we were misled by benchmarks. Most long-context benchmarks are built around the QA scenario, i.e. "finding a needle in a haystack". But in agentic scenarios, the model needs to find EVERYTHING in the haystack, and it simply cannot spare enough attention for that.
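To put a rough number on the gap between finding one needle and finding everything, here is a minimal sketch. It assumes, purely for illustration, that each relevant item is retrieved independently with the same probability; the post does not state such a model, and the function name `prob_all_found` is hypothetical.

```python
def prob_all_found(per_item_recall: float, num_items: int) -> float:
    """Chance that all num_items relevant items are retrieved.

    Illustrative assumption (not from the post): each item is found
    independently with the same probability per_item_recall.
    """
    if not 0.0 <= per_item_recall <= 1.0:
        raise ValueError("per_item_recall must be between 0 and 1")
    if num_items < 0:
        raise ValueError("num_items must be non-negative")
    # Independent events multiply, so the joint probability is p ** k.
    return per_item_recall ** num_items


if __name__ == "__main__":
    # A single-needle QA benchmark corresponds to num_items == 1;
    # an agentic task may need every one of many items.
    for k in (1, 10, 50):
        print(f"items={k:>2}  P(all found)={prob_all_found(0.99, k):.3f}")
```

Under this toy model, a recall rate that looks near-perfect on a single-needle benchmark compounds quickly once every item in the context matters. Real retrieval failures are unlikely to be independent, so treat the numbers as intuition only.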

reacted to di-zhang-fdu's post with 🔥 6 days ago

The training dataset of ChemVLM is now open-sourced, check it out! https://huggingface.co/datasets/di-zhang-fdu/chemvlm-sft-datasets Paper: https://huggingface.co/papers/2408.07246

ajibawa-2023's datasets (21)

ajibawa-2023/Persona-100k

Viewer • Updated Jul 13 • 100k • 46 • 5

ajibawa-2023/Reasoning-Maths-College

Viewer • Updated Apr 24 • 965 • 45 • 2

ajibawa-2023/Audio-Children-Stories-Collection-Large

Viewer • Updated Apr 1 • 2.1k • 83 • 8

ajibawa-2023/Audio-Children-Stories-Collection

Viewer • Updated Mar 27 • 600 • 302 • 6

ajibawa-2023/Software-Architecture

Preview • Updated Oct 28, 2024 • 45 • 27

ajibawa-2023/Software-Architectural-Frameworks

Viewer • Updated Oct 4, 2024 • 1.26k • 34 • 9

ajibawa-2023/Maths-College

Viewer • Updated May 8, 2024 • 970k • 203 • 50

ajibawa-2023/Maths-Grade-School

Viewer • Updated May 8, 2024 • 980k • 118 • 27

ajibawa-2023/Education-College-Students

Viewer • Updated Apr 10, 2024 • 254k • 166 • 5

ajibawa-2023/Education-High-School-Students

Viewer • Updated Apr 10, 2024 • 255k • 28 • 9

ajibawa-2023/Education-Young-Children

Viewer • Updated Apr 10, 2024 • 256k • 167 • 13

ajibawa-2023/Education-Researchers

Viewer • Updated Apr 10, 2024 • 255k • 15 • 8

ajibawa-2023/Children-Stories-Collection

Viewer • Updated Mar 16, 2024 • 897k • 465 • 53

ajibawa-2023/General-Stories-Collection

Viewer • Updated Mar 16, 2024 • 1.07M • 346 • 35

ajibawa-2023/OpenHermes-2.5-Code-290k

Updated Feb 19, 2024 • 19 • 7

ajibawa-2023/Code-290k-ShareGPT

Viewer • Updated Jan 16, 2024 • 289k • 85 • 29

ajibawa-2023/Julia-Proof-Pile-2

Viewer • Updated Dec 26, 2023 • 293k • 13 • 4

ajibawa-2023/Code-74k-ShareGPT

Viewer • Updated Dec 8, 2023 • 73.9k • 32 • 18

ajibawa-2023/SlimOrca-ShareGPT

Viewer • Updated Nov 14, 2023 • 518k • 18 • 7

ajibawa-2023/Python-Code-23k-ShareGPT

Viewer • Updated Nov 11, 2023 • 22.6k • 76 • 41

ajibawa-2023/Mathjson

Viewer • Updated Nov 11, 2023 • 17.1k • 4 • 3