
Commit 5547b4d

Introduce raw_alloc, raw_ptr, raw_ref (#371)
- rename `gc_alloc` and `ptr` to `raw_alloc` and `raw_ptr`
- use the syntax `raw_alloc[T]` instead of `raw_alloc(T)`
- introduce `unsafe::raw_ref`
- re-introduce `gc_alloc` as a temporary alias to `raw_alloc` (to be fixed)
- write some docs
2 parents d595bb6 + 2a1a6c8 commit 5547b4d

31 files changed

Lines changed: 820 additions & 327 deletions

docs/mkdocs.yml

Lines changed: 38 additions & 10 deletions
@@ -1,4 +1,4 @@
1-
site_name: SPy Documentation
1+
site_name: SPy
22
docs_dir: src
33
repo_url: https://github.com/spylang/spy
44
repo_name: spylang/spy
@@ -22,25 +22,53 @@ theme:
2222
icon: material/brightness-4
2323
name: Switch to light mode
2424
features:
25-
- navigation.tabs
2625
- navigation.sections
2726
- navigation.expand
2827
- navigation.top
2928
- search.suggest
3029
- search.highlight
3130
- content.code.copy
31+
32+
markdown_extensions:
33+
- attr_list
34+
- pymdownx.highlight:
35+
anchor_linenums: true
36+
line_spans: __span
37+
pygments_lang_class: true
38+
- pymdownx.superfences
39+
- pymdownx.snippets:
40+
base_path: ['..']
41+
- pymdownx.inlinehilite
42+
- pymdownx.blocks.admonition
43+
- pymdownx.tabbed:
44+
alternate_style: true
45+
46+
nav:
47+
- Home: index.md
48+
- contributing.md
49+
- Unsorted notes:
50+
- Low-level memory model: llmem.md
51+
52+
# Example navigation structure with subsections:
53+
# nav:
54+
# - Home: index.md
55+
# - Getting Started:
56+
# - Introduction: getting-started/intro.md
57+
# - Installation: getting-started/installation.md
58+
# - Quick Start: getting-started/quickstart.md
59+
# - User Guide:
60+
# - Overview: user-guide/overview.md
61+
# - Basic Concepts: user-guide/concepts.md
62+
# - Advanced Topics: user-guide/advanced.md
63+
# - API Reference:
64+
# - Functions: api/functions.md
65+
# - Classes: api/classes.md
66+
# - FAQ: faq.md
67+
3268
extra:
3369
version:
3470
provider: mike
3571
social:
3672
- icon: fontawesome/brands/github
3773
link: https://github.com/spylang/spy
3874
name: SPy on GitHub
39-
40-
markdown_extensions:
41-
- pymdownx.superfences
42-
- pymdownx.highlight
43-
- pymdownx.inlinehilite
44-
- pymdownx.blocks.admonition
45-
- pymdownx.tabbed:
46-
alternate_style: true

docs/src/index.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
# <img src="assets/logo.svg" width="60" style="vertical-align: middle;"> SPy
1+
# <img src="assets/logo.svg" width="60" style="vertical-align: middle;"> Home
22

33

44
## What is SPy?

docs/src/llmem.md

Lines changed: 280 additions & 0 deletions
@@ -0,0 +1,280 @@
1+
# Low-level memory model
2+
3+
SPy aims to be a "two-level" language, with a low-level, possibly unsafe core, upon
4+
which to build higher level abstractions which we expect end users to use.
5+
6+
It might help to draw a parallel with CPython: the core of the interpreter and of many
7+
libraries is written in C, which is low level and inherently unsafe. The result is a
8+
safe high-level language which is what most people see and use.
9+
10+
This document describes the low-level memory model of SPy.
11+
12+
While reading this document, it is worth remembering that SPy has two main modes of
13+
execution, interpreted and compiled.
14+
15+
The SPy compiler works by translating `*.spy` code into `*.c` code (after
16+
[redshifting](https://antocuni.eu/2025/10/29/inside-spy-part-1-motivations-and-goals/#redshifting)),
17+
which is then compiled by e.g. `gcc` or `clang`. In the next sections, we will also
18+
explain how SPy types are translated into C.
19+
20+
## Safe vs unsafe code
21+
22+
By default, SPy code is **safe**, meaning that:
23+
24+
1. memory and lifetimes are managed automatically
25+
26+
2. you cannot corrupt memory or access memory which was already freed
27+
28+
However, SPy also offers the `unsafe` module, for writing low-level code and for
29+
specialized cases. At the moment of writing, the `unsafe` module can be imported
30+
freely, but the plan is to allow unsafe code only in a few specific and clearly labeled
30+
sections of the program.
32+
33+
34+
## Primitive types
35+
36+
At the core, we have primitive numeric types such as `i32`, `f32`, `f64`, etc. These are
37+
translated into their C equivalent `int32_t`, `float`, `double`, etc.
38+
39+
Moreover, SPy defines the `int` and `float` aliases, which map to `i32` and `f64`
40+
respectively. At the moment this is hardcoded, but eventually the precise mapping will
41+
depend on the target platform:
42+
43+
44+
<div align="right"><sub><a href="https://github.com/spylang/spy/blob/37ee3e29a7707618adf107ca7d8d19de2942ab55/spy/vm/modules/builtins.py#L222-L230">See on GitHub</a></sub></div>
45+
46+
```python title="spy/vm/modules/builtins.py @ 37ee3e29" linenums="222"
47+
# add aliases for common types. For now we map:
48+
# int -> i32
49+
# float -> f64
50+
#
51+
# We might want to map int to different concrete types, depending on the
52+
# platform? Or maybe have some kind of "configure step"?
53+
BUILTINS.add("int", BUILTINS.w_i32)
54+
BUILTINS.add("float", BUILTINS.w_f64)
55+
```
56+
57+
## Stack-allocated structs
58+
59+
We can define C-like structs:
60+
61+
```python
62+
@struct
63+
class Point:
64+
x: int
65+
y: int
66+
```
67+
68+
Structs can be instantiated directly, and are **immutable**:
69+
70+
```python
71+
p = Point(1, 2)
72+
print(p.x, p.y)
73+
74+
p.x = 3 # TypeError
75+
```
76+
77+
Structs have **inline storage**:
78+
79+
1. if used as local variables, they are allocated "on the stack";
80+
81+
2. if used as fields of a bigger struct, they are allocated "inline" in the bigger
82+
struct;
83+
84+
3. they are passed by value, which means that passing around big structs can be
85+
costly.
86+
87+
88+
For example:
89+
90+
```python
91+
@struct
92+
class Rect:
93+
a: Point
94+
b: Point
95+
96+
assert sizeof(Point) == sizeof(int) * 2
97+
assert sizeof(Rect) == sizeof(int) * 4
98+
99+
r = Rect(Point(1, 2), Point(3, 4))
100+
```
101+
102+
The compiler translates them into plain C structs, something along these lines:
103+
104+
```c
105+
typedef struct {
106+
int32_t x;
107+
int32_t y;
108+
} Point;
109+
110+
typedef struct {
111+
Point a;
112+
Point b;
113+
} Rect;
114+
115+
Point p = {1, 2};
116+
Rect r = {(Point){1, 2}, (Point){3, 4}};
117+
```
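The inline layout can also be checked from plain Python. The following is a minimal sketch that uses the standard `ctypes` module as a stand-in for the generated C structs; it is not SPy code, and it only assumes that `int` maps to a 32-bit integer as described earlier:

```python
import ctypes


class Point(ctypes.Structure):
    # Two 32-bit integers, mirroring the C struct above
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


class Rect(ctypes.Structure):
    # The two Points are stored inline, not behind pointers
    _fields_ = [("a", Point), ("b", Point)]


point_size = ctypes.sizeof(Point)
rect_size = ctypes.sizeof(Rect)
b_offset = Rect.b.offset  # where `b` starts inside a Rect

r = Rect(Point(1, 2), Point(3, 4))

assert point_size == 2 * ctypes.sizeof(ctypes.c_int32)
assert rect_size == 4 * ctypes.sizeof(ctypes.c_int32)
assert b_offset == point_size  # `b` sits right after `a`
```

Because `b` starts exactly one `Point` after `a`, a `Rect` is a single contiguous block with no indirection.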
118+
119+
Stack-allocated structs are always safe to use.
120+
121+
## Raw and GC memory
122+
123+
The heap is conceptually divided into two main regions: **raw memory** and **GC memory**.
124+
The low-level manipulation of both areas of memory is **unsafe**.
125+
126+
Raw memory is "C style":
127+
128+
- memory is allocated with `raw_alloc[T]`; pointers are of type `raw_ptr[T]`;
129+
130+
- the memory must be explicitly released by calling `raw_free[T]` (NOT IMPLEMENTED
131+
YET!)
132+
133+
- it is the responsibility of the programmer to avoid use-after-free and out-of-bounds
134+
access;
135+
136+
- once allocated, the address of the memory is non-movable and can be safely passed to
137+
3rd party libraries
138+
139+
GC memory:
140+
141+
- memory is allocated with `gc_alloc[T]`; pointers are of type `gc_ptr[T]`;
142+
143+
- the memory is automatically released by the GC when it's no longer needed;
144+
145+
- it is *still* the responsibility of the programmer to avoid out-of-bounds access to
146+
arrays;
147+
148+
- objects are potentially **movable** (depending on the GC strategy), so their address
149+
might change;
150+
151+
- it is possible to get a temporary non-movable `raw_ptr` by "pinning" a `gc_ptr` (NOT
152+
IMPLEMENTED YET!).
153+
154+
/// warning
155+
GC is not implemented yet; currently `gc_alloc` is just an alias to `raw_alloc`,
156+
meaning that it leaks memory
157+
///
158+
159+
160+
## Heap-allocated structs
161+
162+
We can allocate structs "on the heap". This is a lower-level functionality which
163+
requires the use of `unsafe` functions; they can be allocated both in raw and GC memory:
164+
165+
```python
166+
from unsafe import raw_ptr, raw_alloc
167+
168+
p1: raw_ptr[Point] = raw_alloc[Point](1)
169+
p1.x = 1
170+
p1.y = 2
171+
```
172+
173+
Unlike their stack-allocated counterparts, heap-allocated structs are mutable.
174+
You should think of heap-allocated structs as the basic building block for all
175+
higher-level types.
176+
177+
It might be helpful to draw a parallel with CPython again: in CPython, objects of type
178+
`tuple` and `str` are immutable, but under the hood they are implemented by mutable heap
179+
allocated structs written in C.
180+
181+
182+
183+
## Raw allocation
184+
185+
`raw_alloc[T](n)` allocates an **array** of `T` on the heap. To allocate a single
186+
element, you just pass `n = 1`. For convenience, if `T` is a struct you can access its
187+
fields without having to use `[0]`, exactly as in C:
188+
189+
```python
190+
def test(p: raw_ptr[Point]) -> None:
191+
assert p.x == p[0].x
192+
assert p.y == p[0].y
193+
```
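The same equivalence between "the pointed-to item" and "item 0" can be seen in plain Python. This is only an analogy built on the standard `ctypes` module, not SPy code: `storage` plays the role of an allocation of three items, and `p.contents` plays the role of accessing fields directly on the pointer:

```python
import ctypes


class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


# An array of 3 Points, viewed through a pointer to its first element
storage = (Point * 3)()
p = ctypes.cast(storage, ctypes.POINTER(Point))

p[0].x = 10
p[0].y = 20
p[2].x = 30

# Dereferencing the pointer is the same as indexing item 0
assert p.contents.x == p[0].x == 10
assert p.contents.y == p[0].y == 20
# Writes through the pointer land in the underlying array
assert storage[2].x == 30
```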
194+
195+
The low-level representation of pointers depends on the execution mode.
196+
197+
The interpreter keeps track of the address **and the length** of the allocated region,
198+
and checks for out-of-bounds access:
199+
200+
<div align="right"><sub><a href="https://github.com/spylang/spy/blob/8a360bc11d95db09fee34964ce3cab6639c06f1f/spy/vm/modules/unsafe/ptr.py#L128-L150">See on GitHub</a></sub></div>
201+
```python title="spy/vm/modules/unsafe/ptr.py @ 8a360bc1" linenums="128"
202+
@UNSAFE.builtin_type("__base_ptr")
203+
class W_BasePtr(W_Object):
204+
[...]
205+
w_ptrtype: W_BasePtrType
206+
addr: fixedint.Int32
207+
length: fixedint.Int32 # how many items in the array
208+
```
209+
210+
The same happens in **debug compiled mode**, where `raw_ptr[T]` is translated to a fat
211+
pointer. Finally, in **release compiled mode**, `raw_ptr[T]` is translated as a plain C
212+
pointer, and there is no out-of-bounds check:
213+
214+
215+
<div align="right"><sub><a href="https://github.com/spylang/spy/blob/8a360bc11d95db09fee34964ce3cab6639c06f1f/spy/libspy/include/spy/unsafe.h#L12-L18">See on GitHub</a></sub></div>
216+
```c title="spy/libspy/include/spy/unsafe.h @ 8a360bc1" linenums="12"
217+
typedef struct Ptr_T {
218+
T *p;
219+
#ifdef SPY_PTR_CHECKED
220+
size_t length;
221+
#endif
222+
} Ptr_T;
223+
224+
```
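To make the idea of a checked ("fat") pointer concrete, here is a small model in plain Python. It is a sketch under stated assumptions, not the actual interpreter or C backend code: the `CheckedPtr` class, its `_check` helper and the `heap` list are all hypothetical names introduced for illustration.

```python
class CheckedPtr:
    """Hypothetical fat pointer: a base address plus the number of items."""

    def __init__(self, memory: list, addr: int, length: int) -> None:
        self.memory = memory  # stand-in for the heap
        self.addr = addr      # index of the first item
        self.length = length  # how many items the allocation holds

    def _check(self, i: int) -> int:
        # The bounds check that a plain (unchecked) pointer would skip
        if not 0 <= i < self.length:
            raise IndexError(f"index {i} out of bounds for length {self.length}")
        return self.addr + i

    def __getitem__(self, i: int):
        return self.memory[self._check(i)]

    def __setitem__(self, i: int, value) -> None:
        self.memory[self._check(i)] = value


heap = [0] * 16
p = CheckedPtr(heap, addr=4, length=3)
p[2] = 42  # in bounds: writes heap[6]

try:
    p[3]  # one past the end of the allocation
    out_of_bounds_detected = False
except IndexError:
    out_of_bounds_detected = True
```

Dropping the `length` field and the check gives the behavior of a plain pointer: the access at index 3 would silently read whatever happens to live at `heap[7]`.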
225+
226+
## GC allocation
227+
228+
**Not implemented yet**
229+
230+
See the **very rough** [plan](https://github.com/antocuni/spy-memory-model)
231+
232+
233+
## Raw references: `raw_ref[T]`
234+
235+
Structs and pointers are loosely modeled on C, but there is a big semantic
236+
difference between Python and C that we need to take into account in order to provide an
237+
intuitive way to deal with structs.
238+
239+
Consider the following example, using the `Rect` and `Point` structs defined above. It
240+
modifies a **nested** struct:
241+
242+
```python
243+
def test(r: raw_ptr[Rect]) -> None:
244+
r.a.x = 0
245+
```
246+
247+
In Python (and thus SPy) the above expression decomposes to:
248+
249+
```python
250+
tmp = r.a
251+
tmp.x = 0
252+
```
253+
254+
or, more explicitly:
255+
256+
```python
257+
tmp = getattr(r, "a")
258+
setattr(tmp, "x", 0)
259+
```
260+
261+
The naive implementation of `r.a` would be to return a *copy* of the `Point` but this
262+
means that `tmp.x` would modify the copy, not the original.
263+
264+
To solve the problem, we return a **reference** instead:
265+
266+
```python
267+
tmp: raw_ref[Point] = r.a
268+
tmp.x = 0
269+
```
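The difference between returning a copy and returning a reference can be reproduced in plain Python with the standard `ctypes` module. This is only an analogy, not SPy code: `ctypes` field access happens to return a view that shares the parent's memory, which behaves like the reference described here, while `from_buffer_copy` makes an explicit copy:

```python
import ctypes


class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


class Rect(ctypes.Structure):
    _fields_ = [("a", Point), ("b", Point)]


r = Rect(Point(1, 2), Point(3, 4))

# Copy semantics: the write goes to the copy and `r` is unchanged
copied = Point.from_buffer_copy(r.a)
copied.x = 0
x_after_copy_write = r.a.x

# Reference semantics: the view shares memory with `r`
ref = r.a
ref.x = 0
x_after_ref_write = r.a.x

assert x_after_copy_write == 1
assert x_after_ref_write == 0
```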
270+
271+
Unlike pointers, references cannot be indexed and cannot be `NULL`. Moreover, a
272+
`raw_ref[T]` can be automatically converted into a `T`. E.g. consider the following:
273+
274+
```python
275+
def foo(r: raw_ptr[Rect]) -> None:
276+
r2: Rect = r # ERROR: cannot convert raw_ptr[Rect] to Rect
277+
p: Point = r.a # works: r.a is raw_ref[Point], and it's converted to a Point
278+
```
279+
280+
In the C backend, `raw_ref[T]` is implemented in the exact same way as `raw_ptr[T]`.

examples/multifile/myarray.spy

Lines changed: 4 additions & 4 deletions
@@ -15,9 +15,9 @@ def array1d(DTYPE):
1515
__ll__: ptr[ArrayData]
1616

1717
def __new__(l: i32) -> ndarray:
18-
data = gc_alloc(ArrayData)(1)
18+
data = gc_alloc[ArrayData](1)
1919
data.l = l
20-
data.items = gc_alloc(DTYPE)(l)
20+
data.items = gc_alloc[DTYPE](l)
2121
return ndarray.__make__(data)
2222

2323
def __getitem__(self, i: i32) -> DTYPE:
@@ -52,10 +52,10 @@ def array2d(DTYPE):
5252
__ll__: ptr[ArrayData]
5353

5454
def __new__(h: i32, w: i32) -> ndarray:
55-
data = gc_alloc(ArrayData)(1)
55+
data = gc_alloc[ArrayData](1)
5656
data.h = h
5757
data.w = w
58-
data.items = gc_alloc(DTYPE)(h * w)
58+
data.items = gc_alloc[DTYPE](h * w)
5959
return ndarray.__make__(data)
6060

6161
def __getitem__(self, i: i32, j: i32) -> DTYPE:

examples/point.spy

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ class Point:
1818

1919
# eventually structs will get an automatic ctor, like dataclasses
2020
def new_point(x: f64, y: f64) -> ptr[Point]:
21-
p = gc_alloc(Point)(1) # allocate 1 Point
21+
p = gc_alloc[Point](1) # allocate 1 Point
2222
p.x = x
2323
p.y = y
2424
return p
