Stars
An end-to-end Transformer fusion framework integrating DAG-based pipeline scheduling with whole-encoder and whole-decoder fusion.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Official repo for the paper "An Effective Training Framework for Light-Weight Automatic Speech Recognition Models" accepted at InterSpeech 2025.
Efficient vision foundation models for high-resolution generation and perception.
[PACT'24] GraNNDis: a fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and mini-batch training.
A verification tool for ensuring parallelization equivalence in distributed model training.
verify-llm / TrainVerify
Forked from microsoft/TrainVerify. A verification tool for ensuring parallelization equivalence in distributed model training.
Lists of company-wise questions available on LeetCode Premium. Every CSV file in the companies directory corresponds to a list of questions on LeetCode for a specific company based on the LeetCode …
UniSparse: An Intermediate Language for General Sparse Format Customization (OOPSLA'24)
SparseTIR: Sparse Tensor Compiler for Deep Learning
This project includes a prototype implementation of BOLT—a bandwidth-optimized, lightning-fast Oblivious Map—along with benchmarking code for performance comparisons.
This was originally a collection of papers on neural network accelerators. Now it's more of a personal selection of research on deep learning and computer architecture.
Official code repository for "Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving [MICRO'25]"
[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring
PyTorch implementation of the sparse attention from the paper "Generating Long Sequences with Sparse Transformers" (a minimal block-sparse masking sketch follows this list).
Benchmark harness and baseline results for the NeuroBench algorithm track.
[TVLSI 2025] ACiM Inference Simulation Framework in "ASiM: Modeling and Analyzing Inference Accuracy of SRAM-Based Analog CiM Circuits"
Fast and memory-efficient exact attention
A sparse attention kernel supporting mixed sparse patterns
Fast and low-memory attention layer written in CUDA
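Several of the entries above concern sparse attention kernels. As a rough illustration of what a block-sparse attention pattern looks like, here is a minimal PyTorch sketch that restricts the attention matrix to its block diagonal. The helper names (`block_sparse_mask`, `sparse_attention`) and the block size are illustrative assumptions and do not correspond to the API of any repository listed here.

```python
# A minimal sketch of block-diagonal sparse attention, assuming PyTorch.
# Helper names (block_sparse_mask, sparse_attention) and the block size are
# illustrative only; real sparse kernels never materialize the dense mask.
import torch


def block_sparse_mask(seq_len: int, block_size: int) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask keeping only block-diagonal entries."""
    blocks = torch.arange(seq_len) // block_size       # block index per position
    return blocks.unsqueeze(0) == blocks.unsqueeze(1)  # True = allowed to attend


def sparse_attention(q, k, v, block_size: int = 64):
    """q, k, v: (batch, heads, seq_len, head_dim)."""
    seq_len, head_dim = q.shape[-2], q.shape[-1]
    mask = block_sparse_mask(seq_len, block_size).to(q.device)
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5  # dense logits, for clarity only
    scores = scores.masked_fill(~mask, float("-inf"))   # out-of-block entries -> 0 after softmax
    return torch.softmax(scores, dim=-1) @ v


if __name__ == "__main__":
    q = k = v = torch.randn(1, 8, 256, 64)
    out = sparse_attention(q, k, v, block_size=64)
    print(out.shape)  # torch.Size([1, 8, 256, 64])
```

Note that this sketch only shows the masking pattern; the kernels in the repositories above gain their speed and memory savings by never computing the masked-out score entries in the first place.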