Git's data model is a directed acyclic graph (DAG).
- Each commit points to exactly one tree (the project snapshot) and to zero or more parent commits.
- Trees point to blobs (file contents) and other trees (subdirectories).
- Blobs are leaves — raw file bytes.
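Each commit references the entire project through its root tree. Objects are named by a hash of their content, so a file that is unchanged between commits resolves to the same blob and is stored only once. The same holds for unchanged subdirectories, which resolve to the same tree.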
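The sharing of unchanged blobs follows from how blob IDs are computed. As a minimal sketch, assuming the default SHA-1 object format (repositories created with SHA-256 differ), a blob ID is the hash of a short header plus the file bytes. The helper name `git_blob_id` is made up for illustration and is not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content.

    Assumes the SHA-1 object format: hash of b"blob <size>\\0" + content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two files with identical bytes get the same ID, regardless of name or commit,
# so a tree in a later commit can point at the blob that is already stored.
readme_in_commit_1 = b"hello\n"
readme_in_commit_2 = b"hello\n"
same_blob = git_blob_id(readme_in_commit_1) == git_blob_id(readme_in_commit_2)

# Any change to the bytes produces a different ID, hence a new blob.
changed_blob = git_blob_id(b"hello\n") != git_blob_id(b"hello!\n")

print(same_blob, changed_blob)
```

The result for a given file can be compared against `git hash-object <file>` in a SHA-1 repository.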
The DAG enables:
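- Merges: a commit can have multiple parents (a merge with more than two parents is called an octopus merge).
- Branches: movable pointers to commits; a branch advances when a new commit is made on it.
- Distributed sync: every full (non-shallow) clone has the complete graph; fetching and pushing transfer only the objects the other side is missing.
- Reachability checks: git fsck walks the graph to verify object connectivity and integrity; git gc prunes objects that are no longer reachable from any ref or reflog entry (by default only after a grace period).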
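The idea of reachability can be sketched on a toy graph. This is an illustration only, not how Git implements it: the commit names, the `PARENTS` and `REFS` tables, and the `reachable` function are all hypothetical, and real Git also treats tags, reflog entries, and the index as starting points.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy history: commit name -> list of parent commit names.
PARENTS: Dict[str, List[str]] = {
    "A": [],          # root commit, no parents
    "B": ["A"],
    "C": ["A"],
    "M": ["B", "C"],  # merge commit with two parents
    "X": ["A"],       # a commit that no branch points to
}

# Hypothetical branch refs: branch name -> commit it points to.
REFS: Dict[str, str] = {"main": "M", "topic": "C"}


def reachable(tips: Iterable[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every commit reachable from the tips by following parent links."""
    seen: Set[str] = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue  # already visited via another path through the DAG
        seen.add(commit)
        stack.extend(parents[commit])
    return seen


live = reachable(REFS.values(), PARENTS)
unreachable = set(PARENTS) - live

print(sorted(live))         # ['A', 'B', 'C', 'M']
print(sorted(unreachable))  # ['X']
```

In this toy model, "X" is the kind of node a garbage collector would eventually be allowed to remove, because no ref leads to it.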
Visualize the DAG:
git log --graph --oneline --all