- https://jax-ml.github.io/scaling-book/
- ocr
- https://applied-llms.org/
- https://github.com/ray-project/llm-numbers
- https://github.com/KalyanKS-NLP/llm-engineer-toolkit
- https://koomen.dev/essays/horseless-carriages/ horseless carriages of AI
- https://github.com/mlabonne/llm-course
- https://github.com/preset-io/promptimize
- simple prompt engineering training from Anthropic https://docs.google.com/spreadsheets/d/19jzLgRruG9kjUQNKtCg1ZjdD6l6weA6qRXG5zLIAhC8/edit?gid=150872633#gid=150872633 (Anthropic's Prompt Engineering Interactive Tutorial [PUBLIC ACCESS])
- https://colab.research.google.com/drive/1OnIipRwuHOZbKHN0haHGD0OnckBGfzqx auto optimization
- prompting frameworks
- prompt techniques
- CoRT (Chain of Recursive Thoughts) https://github.com/PhialsBasement/Chain-of-Recursive-Thoughts
safety
- https://github.com/ajndkr/lanarky
- https://github.com/deepset-ai/haystack (with Elasticsearch)
- https://x.com/clementdelangue/status/1934672721066991908
- https://simonwillison.net/2024/Dec/24/qvq/
- https://simonwillison.net/2024/Dec/26/deepseek-v3/
- unstract.com LLMWhisperer https://pg.llmwhisperer.unstract.com/
- https://cloud.llamaindex.ai LlamaParse
- https://github.com/DS4SD/docling
- https://github.com/VikParuchuri/marker
- https://github.com/QuivrHQ/MegaParse
- https://github.com/microsoft/markitdown
- https://github.com/opendatalab/PDF-Extract-Kit
- https://python.useinstructor.com/
- https://github.com/vlm-run/vlmrun-cookbook and good comments https://news.ycombinator.com/item?id=43187209
- https://github.com/boundaryml/baml
- https://github.com/NVIDIA/garak
- https://github.com/Azure/PyRIT
- https://www.promptfoo.dev/docs/red-team/quickstart/
- https://github.com/mbrg/power-pwn
- cryptography
- prompt injection scanning
- https://github.com/greshake/llm-security
- https://bishopfox.com/resources/rvasec2024-patch-perfect-harmonizing-llms
- https://www.microsoft.com/en-us/security/blog/2025/04/24/new-whitepaper-outlines-the-taxonomy-of-failure-modes-in-ai-agents/
- https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/ANSSI-BSI-joint-releases/LLM-based_Systems_Zero_Trust.pdf?__blob=publicationFile&v=3
- https://genai.owasp.org/resource/agentic-ai-threats-and-mitigations/
- https://www.404media.co/email/f459caa7-1a58-4f31-a9ba-3cb53a5046a4/
- https://www.npr.org/2024/12/10/nx-s1-5222574/kids-character-ai-lawsuit
- https://gandalf.lakera.ai/baseline genAI security training
- backdoor
- email summarization copiarate https://www.youtube.com/watch?v=84NVG1c5LRI
- uncensor with abliteration https://colab.research.google.com/drive/1VYm3hOcvCpbGiqKZb141gJwjdmmCcVpR?usp=sharing
identifying security issues
- graph aware scanning
- MCP
- https://github.com/cisco-ai-defense/mcp-scanner
- https://airedteamwhitepapers.blob.core.windows.net/lessonswhitepaper/MS_AIRT_Lessons_eBook.pdf
- https://www.microsoft.com/en-us/security/blog/2025/01/13/3-takeaways-from-red-teaming-100-generative-ai-products/
- https://github.com/Azure/PyRIT
- https://arxiv.org/pdf/2302.04222 Glaze
- https://arxiv.org/pdf/2310.13828 Nightshade
- https://huggingface.co/docs/peft/index
- https://github.com/unslothai/unsloth
- https://github.com/axolotl-ai-cloud/axolotl
- https://github.com/Lightning-AI/litgpt
- https://github.com/hiyouga/LLaMA-Factory
- tool lists
- https://github.com/meta-llama/synthetic-data-kit
- https://pypi.org/project/tokenlearn/
- auto eval https://colab.research.google.com/drive/1Igs3WZuXAIv9X0vwqiE90QlEPys8e8Oa?usp=sharing
- axolotl https://colab.research.google.com/drive/1TsDKNo2riwVmU55gjuBgB1AXVtRRfRHW?usp=sharing
- unsloth https://colab.research.google.com/drive/164cg_O7SV7G8kZr_JXqLd6VC7pd86-1Z?usp=sharing
SVD and Marchenko-Pastur seem to be really useful
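A minimal sketch of how the two ideas combine, assuming a weight matrix is modeled as low-rank signal plus i.i.d. noise of known standard deviation `sigma`: singular values above the approximate Marchenko-Pastur noise edge `sigma * (sqrt(m) + sqrt(n))` are treated as signal. The function names, the `margin` safety factor, and the planted-signal demo are illustrative assumptions, not the implementation of any tool listed here.

```python
import numpy as np


def noise_edge(shape, sigma, margin=1.1):
    """Approximate largest singular value of an m x n pure-noise matrix.

    For i.i.d. entries with standard deviation sigma, the top singular value
    concentrates near sigma * (sqrt(m) + sqrt(n)). `margin` is a hypothetical
    safety factor that absorbs finite-size fluctuations.
    """
    m, n = shape
    return margin * sigma * (np.sqrt(m) + np.sqrt(n))


def count_signal_directions(weight, sigma, margin=1.1):
    """Count singular values that stick out above the noise edge."""
    singular_values = np.linalg.svd(weight, compute_uv=False)
    return int(np.sum(singular_values > noise_edge(weight.shape, sigma, margin)))


# Demo on synthetic data: a rank-3 signal planted in unit-variance noise.
rng = np.random.default_rng(0)
m, n = 400, 200
noise = rng.normal(0.0, 1.0, size=(m, n))
u, _ = np.linalg.qr(rng.normal(size=(m, 3)))  # orthonormal left factors
v, _ = np.linalg.qr(rng.normal(size=(n, 3)))  # orthonormal right factors
signal = u @ np.diag([200.0, 150.0, 100.0]) @ v.T

n_signal = count_signal_directions(signal + noise, sigma=1.0)
print(n_signal)
```

In practice `sigma` is not known and has to be estimated, for example from the bulk of the singular value spectrum; that step is left out here.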
- https://github.com/arcee-ai/DistillKit https://www.youtube.com/watch?v=JE7SuP049mQ
- https://github.com/cognitivecomputations/spectrum https://www.youtube.com/watch?v=CTncBjRgktk
- https://github.com/arcee-ai/mergekit
- https://github.com/arcee-ai/EvolKit
- https://www.arcee.ai/blog/how-arcee-ai-helped-madeline-co-build-a-world-class-reasoning-model-from-first-principles- https://www.arcee.ai/product/arcee-conductor
- https://github.com/microsoft/DeepSpeed
- https://pytorch.org/tutorials/distributed/home.html
- https://lightning.ai/docs/pytorch/stable/
- https://github.com/hpcaitech/ColossalAI
- https://github.com/open-webui/open-webui
- https://lmstudio.ai/
- https://opencat.app/en/
- https://docs.chainlit.io/get-started/overview
- https://www.gradio.app/
- https://streamlit.io/
- https://github.com/llmware-ai/llmware
- https://github.com/infiniflow/ragflow?tab=readme-ov-file
- https://github.com/Filimoa/open-parse
- rag tools
- https://github.com/google-gemini/gemini-fullstack-langgraph-quickstart
- https://github.com/eyelevelai/groundx-on-prem
- https://github.com/morphik-org/morphik-core
- https://github.com/timescale/pgai
- https://github.com/humanlayer/12-factor-agents
- https://github.com/The-Pocket/PocketFlow
- https://claudiacode.com/
- https://github.com/Hona/opencode-ralph
- firmware max https://buy.stripe.com/14A5kC4W79ee0Pm42ea3u01?prefilled_promo_code=EARLYBIRDZ
@code-simplifier:code-simplifier
- 10.17323/jle.2020.11046
- https://livekit.io/
- https://huggingface.co/Zyphra/Zonos-v0.1-hybrid
- https://github.com/nari-labs/dia
- https://github.com/KoljaB/RealtimeVoiceChat
- https://github.com/resemble-ai/chatterbox
- https://handy.computer/
- explanation videos
- https://github.com/containers/ramalama
- https://ollama.com/
- https://github.com/vllm-project/vllm
- https://huggingface.co/moonshotai/Kimi-Audio-7B-Instruct
- https://github.com/sgl-project/sglang
on demand AI clusters
- all the public cloud providers (though they often do not focus on offering a simple/streamlined GPU experience for researchers)
- https://hpc-ai.com
- https://lambdalabs.com/service/gpu-cloud/1-click-clusters
- https://vast.ai/
- https://nebius.com/
- https://verda.com/instant-clusters
- runpod
- hyperstack
- coreweave
- crusoe
- https://www.paperspace.com
- https://huggingface.co/spaces
- https://colab.research.google.com/
- https://www.anyscale.com
- https://www.fluidstack.io
- https://www.clustermax.ai/
- eu
- check cooling issues
- https://www.runpod.io/
- https://github.com/exo-explore/exo
- https://github.com/canopyai/Orpheus-TTS
- https://github.com/fixie-ai/ultravox
- https://www.openai.fm/
- Qwen2.5-Omni
- https://zeyuet.github.io/AudioX/
- religion
- https://github.com/ggml-org/LlamaBarn
- https://lmstudio.ai/
- https://anythingllm.com/
- https://localai.io/
typing
constraints
- https://news.ycombinator.com/item?id=46916586
- strong typing
- linting and formatting
- cyclomatic complexity rules
- max line length
- max lines per file
- unused code analysis
- duplication analysis
- modularisation enforcement
- security check
- scripts to ensure shared/util directories are not over stuffed
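A rough sketch of what such a script could look like for three of the constraints above (max line length, max lines per file, and overstuffed shared/util directories). The function name `check_constraints`, the default limits, and the directory names it watches are all assumptions chosen for illustration; established linters already cover line length and complexity, so a script like this mainly earns its place for the project-specific rules.

```python
import tempfile
from pathlib import Path


def check_constraints(root, max_lines=300, max_line_length=100, max_util_files=10):
    """Return a list of human-readable violations found under `root`."""
    root = Path(root)
    violations = []

    # Per-file rules: total line count and individual line length.
    for path in sorted(root.rglob("*.py")):
        rel = path.relative_to(root).as_posix()
        lines = path.read_text(encoding="utf-8").splitlines()
        if len(lines) > max_lines:
            violations.append(f"{rel}: {len(lines)} lines (max {max_lines})")
        for number, line in enumerate(lines, start=1):
            if len(line) > max_line_length:
                violations.append(
                    f"{rel}:{number}: line length {len(line)} (max {max_line_length})"
                )

    # Directory rule: catch-all directories should not collect too many modules.
    watched = {"util", "utils", "shared"}  # hypothetical names, adjust per project
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        if directory.name in watched:
            count = sum(1 for f in directory.iterdir() if f.suffix == ".py")
            if count > max_util_files:
                rel = directory.relative_to(root).as_posix()
                violations.append(f"{rel}/: {count} modules (max {max_util_files})")

    return violations


# Demo on a throwaway tree with deliberately tight limits.
with tempfile.TemporaryDirectory() as tmp:
    utils = Path(tmp) / "utils"
    utils.mkdir()
    (utils / "a.py").write_text("x = 1\n", encoding="utf-8")
    (utils / "b.py").write_text("y = '" + "z" * 50 + "'\n", encoding="utf-8")
    demo_violations = check_constraints(
        tmp, max_lines=5, max_line_length=40, max_util_files=1
    )

for violation in demo_violations:
    print(violation)
```

Run in CI or as a pre-commit step, a non-empty result would fail the build, which gives a coding agent a concrete signal to react to.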
market research
workflow
IT/coding
security
some stacks
data agent
legal
HR & career
Agents
Design
finance
unclassified
- https://github.com/yvgude/lean-ctx
- https://github.com/router-for-me/CLIProxyAPI
- https://github.com/Soju06/codex-lb and https://github.com/snipeship/ccflare
- https://github.com/prakersh/codexmultiauth
- https://github.com/juyterman1000/entroly
- https://github.com/ruvnet/ruflo
- https://github.com/sergebulaev/linkedin-skills
loops
routing