
LLM

good content

prompts

https://github.com/preset-io/promptimize
simple prompt engineering training from Anthropic https://docs.google.com/spreadsheets/d/19jzLgRruG9kjUQNKtCg1ZjdD6l6weA6qRXG5zLIAhC8/edit?gid=150872633#gid=150872633 (Anthropic's Prompt Engineering Interactive Tutorial [PUBLIC ACCESS])
https://colab.research.google.com/drive/1OnIipRwuHOZbKHN0haHGD0OnckBGfzqx automatic prompt optimization
prompting frameworks
- https://github.com/masci/banks
- https://github.com/boundaryml/baml
prompt techniques
- CoRT (Chain of Recursive Thoughts) https://github.com/PhialsBasement/Chain-of-Recursive-Thoughts

hooks

safety

https://www.youtube.com/watch?v=VqDs46A8pqE

tokenization

https://github.com/openai/tiktoken

vector databases

hallucination

https://docs.vectara.com/docs/learn/hallucination-evaluation

private

https://bdtechtalks.com/2023/06/01/create-privategpt-local-llm/amp/

stacks

serving LLMs

https://github.com/ajndkr/lanarky
https://github.com/deepset-ai/haystack (with Elasticsearch)

full e2e example

ready made connectors

no code GUI tools

models

document preprocessors

unstract.com LLMWhisperer https://pg.llmwhisperer.unstract.com/
- https://github.com/Zipstack/unstract
https://cloud.llamaindex.ai LlamaParse
https://github.com/DS4SD/docling
https://github.com/VikParuchuri/marker
https://github.com/QuivrHQ/MegaParse
https://github.com/microsoft/markitdown
https://github.com/opendatalab/PDF-Extract-Kit

HTML cleanup

https://emschwartz.me/comparing-13-rust-crates-for-extracting-text-from-html/

pdf preprocessors

entity extraction

naming entities

https://github.com/urchade/GLiNER

security

scraper & automation

jailbreaks

https://github.com/CHATS-lab/persuasive_jailbreaker

demonstrations

ethical challenges

https://www.youtube.com/watch?v=giT0ytynSqg

security considerations

security problems

identifying security issues

graph aware scanning
- https://hiddenlayer.com/innovation-hub/shadowgenes-uncovering-model-genealogy/
- https://arxiv.org/abs/2501.11830
MCP
https://github.com/cisco-ai-defense/mcp-scanner

red teaming

data poisoning

https://arxiv.org/pdf/2302.04222 Glaze
https://arxiv.org/pdf/2310.13828 Nightshade
- https://nightshade.cs.uchicago.edu/whatis.html

fine-tuning

newer efficient way of fine-tuning and distillation

SVD and the Marchenko-Pastur law (random matrix theory) seem to be really useful here
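
The note does not say how the two are meant to be combined. One common reading is to use the Marchenko-Pastur bulk edge as a cutoff when truncating an SVD: singular values below the edge are treated as noise, the rest as signal. A minimal sketch under that assumption follows. The function names, the `margin` parameter, and the assumption that the noise level `sigma` is already known are all hypothetical; in practice `sigma` would have to be estimated.

```python
import numpy as np


def mp_upper_edge(n_rows: int, n_cols: int, sigma: float) -> float:
    """Asymptotic upper edge of the singular-value bulk of an n_rows x n_cols
    matrix with i.i.d. zero-mean entries of standard deviation sigma."""
    return sigma * (np.sqrt(n_rows) + np.sqrt(n_cols))


def mp_truncate(weights: np.ndarray, sigma: float, margin: float = 1.05):
    """Keep only the singular components above the Marchenko-Pastur edge.

    margin is a hypothetical finite-size safety factor (not part of the law
    itself): at finite sizes the largest noise singular value can land
    slightly above the asymptotic edge.
    Returns (low_rank_approximation, kept_rank).
    """
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    keep = s > margin * mp_upper_edge(weights.shape[0], weights.shape[1], sigma)
    low_rank = (u[:, keep] * s[keep]) @ vt[keep, :]
    return low_rank, int(keep.sum())


# Synthetic demo: a rank-2 "signal" matrix buried in Gaussian noise.
rng = np.random.default_rng(0)
n_rows, n_cols, true_rank, noise_sigma = 200, 100, 2, 0.1
signal = rng.standard_normal((n_rows, true_rank)) @ rng.standard_normal((true_rank, n_cols))
noisy = signal + noise_sigma * rng.standard_normal((n_rows, n_cols))

denoised, kept_rank = mp_truncate(noisy, noise_sigma)
print("kept rank:", kept_rank)
print("relative error:", np.linalg.norm(denoised - signal) / np.linalg.norm(signal))
```

The same cutoff could in principle be applied to a weight update (for example, the difference between fine-tuned and base weights) to pick a rank, but that use is an assumption, not something the note states.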

distributed training

prototyping

UIs

router/proxy

https://github.com/katanemo/archgw
litellm

rag frameworks

chunking

https://github.com/HaroldConley/chunk-norris

graph rag

https://github.com/Muvon/octocode

rag applications

https://arxiv.org/abs/2409.13740

workflows

https://workflowai.com/

agents

@code-simplifier:code-simplifier

https://www.atcyrus.com/stories/claude-code-code-simplifier-agent-guide

agent.md/claude.md tips

https://blog.sshh.io/p/how-i-use-every-claude-code-feature

parallel agentic development

https://conductor.build/

harnesses / stacks

memory

interesting skills/templates

verification & testing

https://doi.org/10.17323/jle.2020.11046

education

audio

video

explanation videos

hosting models

hardware

on demand AI clusters

all the public cloud providers (though they often do not focus on offering a simple/streamlined GPU experience for researchers)
https://hpc-ai.com
https://lambdalabs.com/service/gpu-cloud/1-click-clusters
https://vast.ai/
https://nebius.com/
https://verda.com/instant-clusters
runpod
hyperstack
coreweave
crusoe
https://www.paperspace.com
https://huggingface.co/spaces
https://colab.research.google.com/
https://www.anyscale.com
https://www.fluidstack.io
https://www.clustermax.ai/
EU
- https://www.firebird.ai/

selecting the right model for the hardware

https://github.com/AlexsJones/llmfit

validation of hardware

check cooling issues
- https://github.com/huggingface/gpu-fryer
https://www.runpod.io/
- https://github.com/3b1b/manim
https://github.com/exo-explore/exo

video generation

https://github.com/Wan-Video/Wan2.2

realtime

audio

serving

https://fastrtc.org/

models

solutions

data analysis with AI

https://github.com/microsoft/data-formulator

paper finders

https://allenai.org/blog/paper-finder

applications

schooling and tutoring

https://www.synthesis.com/tutor

ai and politics

religion
- Vatican AI rules https://www.vatican.va/roman_curia/congregations/cfaith/documents/rc_ddf_doc_20250128_antiqua-et-nova_en.html

mcp

registry

https://github.com/IBM/mcp-context-forge

applications

local inferencing

observability

https://github.com/Arize-ai/phoenix

useful learnings

guardrails

typing

https://github.com/germanicdev/germanic

constraints

https://news.ycombinator.com/item?id=46916586
- strong typing
- linting and formatting
- cyclomatic complexity rules
- max line length
- max lines per file
- unused code analysis
- duplication analysis
- modularisation enforcement
- security check
- scripts to ensure shared/util directories are not overstuffed
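
Two of the constraints in the list above (max line length, max lines per file) are simple enough to enforce with a small script. The sketch below is one way to do it, using only the standard library. The limits, the `Violation` type, and the function names are hypothetical and not taken from the linked thread; a real setup would more likely rely on an existing linter.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical limits; pick whatever fits the codebase.
MAX_LINE_LENGTH = 100
MAX_LINES_PER_FILE = 400


@dataclass(frozen=True)
class Violation:
    path: str
    line: int  # 1-based line number; 0 means the rule applies to the whole file
    rule: str


def check_text(path, text, max_line_length=MAX_LINE_LENGTH, max_lines=MAX_LINES_PER_FILE):
    """Return the violations found in one file's contents."""
    lines = text.splitlines()
    violations = []
    if len(lines) > max_lines:
        violations.append(Violation(path, 0, "max-lines-per-file"))
    for number, line in enumerate(lines, start=1):
        if len(line) > max_line_length:
            violations.append(Violation(path, number, "max-line-length"))
    return violations


def check_tree(root, pattern="*.py"):
    """Check every file under root whose name matches pattern."""
    violations = []
    for file in sorted(Path(root).rglob(pattern)):
        text = file.read_text(encoding="utf-8", errors="replace")
        violations.extend(check_text(str(file), text))
    return violations
```

Running `check_tree(".")` in CI or a pre-commit hook and failing on a non-empty result gives an agent a hard, machine-checkable limit instead of a style suggestion.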

prompts

market research

https://x.com/socialwithaayan/status/2021233357514997824

stuff to explore more

workflow

IT/coding

https://github.com/DietrichGebert/ponytail

security

https://agent-safehouse.dev/

some stacks

data agent

https://github.com/getnao/nao

legal

https://github.com/zubair-trabzada/ai-legal-claude

HR & career

https://github.com/santifer/career-ops

Agents

https://github.com/nousresearch/hermes-agent

Design

https://styles.refero.design/

finance

https://github.com/TauricResearch/TradingAgents

unclassified

loops

https://cobusgreyling.github.io/loop-engineering/

routing

https://github.com/workweave/router