Papers
Papers by year of release, in reverse chronological order.
2025
- [arXiv] DynaGuard: A Dynamic Guardrail Model With User-Defined Policies. arXiv preprint arXiv:2509.02563
- [NeurIPS] Scaling up test-time compute with latent reasoning: A recurrent depth approach. Neural Information Processing Systems (Spotlight), 2025
- [COLM] Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models. Second Conference on Language Modeling, 2025
2024
- [SC24] Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers. SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (Gordon Bell Finalist), 2024
- [arXiv] Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs. arXiv preprint arXiv:2502.06766
2023
2022
- How to Do a Vocab Swap? A Study of Embedding Replacement for Pre-trained Transformers. Preprint, 2022
2020
- Springer