**`docs/src/docs/ApiReference/task.md`** (9 additions, 10 deletions)
````diff
@@ -52,11 +52,6 @@ The function or callable object defining the logic of the task. This is a positi
 ```python
 from pyper import task
 
-@task
-def add_one(x: int):
-    return x + 1
-
-# OR
 def add_one(x: int):
     return x + 1
 
````
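With the decorator gone, the same function is wrapped at the point of use. A minimal usage sketch of my own, following the calling convention shown later in this diff, where a pipeline is called like a function and its outputs iterated over:

```python
from pyper import task

def add_one(x: int):
    return x + 1

# Wrapping with task() builds a single-task pipeline
pipeline = task(add_one)

# Calling a pipeline returns an iterable of its outputs
for output in pipeline(1):
    print(output)
#> 2
```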
````diff
@@ -123,19 +118,20 @@ When `join` is `False`, a producer-consumer takes each individual output from th
 from typing import Iterable
 from pyper import task
 
-@task(branch=True)
 def create_data(x: int):
     return [x + 1, x + 2, x + 3]
 
-@task(branch=True, join=True)
 def running_total(data: Iterable[int]):
     total = 0
     for item in data:
         total += item
         yield total
 
 if __name__ == "__main__":
-    pipeline = create_data | running_total
+    pipeline = (
+        task(create_data, branch=True)
+        | task(running_total, branch=True, join=True)
+    )
     for output in pipeline(0):
         print(output)
         #> 1
````
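Tracing this example by hand (my own working, not part of the diff): `branch=True` on `create_data` unpacks the returned list into three separate outputs, and `join=True` lets `running_total` consume them as a single stream, yielding each cumulative sum.

```python
# Expected trace for pipeline(0), worked out by hand:
# create_data(0)  -> [1, 2, 3]   branch=True emits 1, 2, 3 individually
# running_total   -> 1, 3, 6     join=True receives the whole stream;
#                                each running sum is yielded in turn
```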
````diff
@@ -190,15 +186,18 @@ The parameter `throttle` determines the maximum size of a task's output queue. T
 import time
 from pyper import task
 
-@task(branch=True, throttle=5000)
 def fast_producer():
     for i in range(1_000_000):
         yield i
 
-@task
 def slow_consumer(data: int):
     time.sleep(10)
     return data
+
+pipeline = (
+    task(fast_producer, branch=True, throttle=5000)
+    | task(slow_consumer)
+)
 ```
 
 In the example above, workers on `fast_producer` are paused after `5000` values have been generated, until workers for `slow_consumer` are ready to start processing again.
````
Note, however, that processes incur a very high overhead: a performance cost in creation and a memory cost in inter-process communication. Specific cases should be benchmarked to fine-tune the task parameters for your program and your machine.
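For CPU-bound work where that overhead is worth paying, a task can be pushed onto a separate process. A rough sketch, assuming Pyper's `task` accepts a `multiprocess` flag (the flag name is an assumption here, not something shown in this diff):

```python
import math

from pyper import task

def cpu_heavy(x: int) -> int:
    # CPU-bound work that would otherwise hold the GIL
    return sum(math.isqrt(i) for i in range(x * 100_000))

if __name__ == "__main__":
    # multiprocess=True is assumed; benchmark it, since process creation
    # and inter-process communication carry the overhead described above
    pipeline = task(range, branch=True) | task(cpu_heavy, multiprocess=True)
    for result in pipeline(10):
        print(result)
```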
````diff
@@ -115,7 +112,6 @@ In Pyper, it is especially important to separate out different types of work int
-Whilst it makes sense to handle the network request concurrently, the call to `process_data` within the same task is blocking and will harm concurrency.
+Whilst it makes sense to handle the network request concurrently, the call to `process_data` within the same task requires holding onto the GIL and will harm concurrency.
 
 Instead, `process_data` should be implemented as a separate function:
````
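A sketch of that separation (the function names and bodies are illustrative assumptions; only the pattern of giving each kind of work its own task comes from the docs):

```python
import asyncio
import json

from pyper import task

async def fetch_data(url: str) -> str:
    # I/O-bound: benefits from concurrent execution
    await asyncio.sleep(0.1)  # stand-in for a real network request
    return '{"value": 1}'

def process_data(raw: str) -> dict:
    # GIL-holding work now lives in its own task,
    # so it no longer blocks the concurrent fetches
    return json.loads(raw)

# Run inside an async context, as in the async example further below
pipeline = task(fetch_data) | task(process_data)
```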
**`docs/src/docs/UserGuide/BasicConcepts.md`** (10 additions, 8 deletions)
````diff
@@ -19,23 +19,24 @@ Pyper follows the [functional paradigm](https://docs.python.org/3/howto/function
 * Python functions are the building blocks used to create `Pipeline` objects
 * `Pipeline` objects can themselves be thought of as functions
 
-For example, to create a simple pipeline, we can wrap a function in the `task` decorator:
+For example, to create a simple pipeline, we can wrap a function in the `task` class:
 
 ```python
 from pyper import task
 
-@task
 def len_strings(x: str, y: str) -> int:
     return len(x) + len(y)
+
+pipeline = task(len_strings)
 ```
 
-This defines `len_strings` as a pipeline consisting of a single task. It takes the parameters `(x: str, y: str)` and generates `int` outputs from an output queue:
+This defines `pipeline` as a pipeline consisting of a single task. It takes the parameters `(x: str, y: str)` and generates `int` outputs from an output queue:
````
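Concretely, the single-task pipeline above can be run like this (a usage sketch of my own, following the calling pattern these docs use elsewhere; `len("hello") + len("world")` is `10`):

```python
from pyper import task

def len_strings(x: str, y: str) -> int:
    return len(x) + len(y)

pipeline = task(len_strings)

# The pipeline takes (x: str, y: str) and yields int outputs
for output in pipeline("hello", "world"):
    print(output)
#> 10
```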
**`docs/src/docs/UserGuide/ComposingPipelines.md`** (16 additions, 8 deletions)
````diff
@@ -53,12 +53,10 @@ from typing import Dict, Iterable
 
 from pyper import task
 
-@task(branch=True)
 def step1(limit: int):
     for i in range(limit):
         yield {"data": i}
 
-@task
 def step2(data: Dict):
     return data | {"hello": "world"}
 
````
````diff
@@ -72,18 +70,26 @@ class JsonFileWriter:
             json.dump(data_list, f, indent=4)
 
 if __name__ == "__main__":
-    pipeline = step1 | step2  # The pipeline
+    pipeline = task(step1, branch=True) | task(step2)  # The pipeline
     writer = JsonFileWriter("data.json")  # A consumer
     writer(pipeline(limit=10))  # Run
 ```
 
 The `>` operator (again inspired by UNIX syntax) is used to pipe a `Pipeline` into a consumer function (any callable that takes an `Iterable` of inputs), returning simply a function that handles the 'run' operation. This is syntactic sugar for the `Pipeline.consume` method.
 ```python
 if __name__ == "__main__":
-    run = step1 | step2 > JsonFileWriter("data.json")
+    run = (
+        task(step1, branch=True)
+        | task(step2)
+        > JsonFileWriter("data.json")
+    )
     run(limit=10)
     # OR
-    run = step1.pipe(step2).consume(JsonFileWriter("data.json"))
+    run = (
+        task(step1, branch=True)
+        .pipe(task(step2))
+        .consume(JsonFileWriter("data.json"))
+    )
     run(limit=10)
 ```
 
````
````diff
@@ -163,12 +169,10 @@ from typing import AsyncIterable, Dict
 
 from pyper import task
 
-@task(branch=True)
 async def step1(limit: int):
     for i in range(limit):
         yield {"data": i}
 
-@task
 def step2(data: Dict):
     return data | {"hello": "world"}
 
````
````diff
@@ -182,7 +186,11 @@ class AsyncJsonFileWriter:
             json.dump(data_list, f, indent=4)
 
 async def main():
-    run = step1 | step2 > AsyncJsonFileWriter("data.json")
+    run = (
+        task(step1, branch=True)
+        | task(step2)
+        > AsyncJsonFileWriter("data.json")
+    )
````
**`docs/src/docs/UserGuide/CreatingPipelines.md`** (3 additions, 13 deletions)
````diff
@@ -19,26 +19,16 @@ Pyper's `task` decorator is the means by which we instantiate pipelines and cont
 ```python
 from pyper import task, Pipeline
 
-@task
 def func(x: int):
     return x + 1
 
-assert isinstance(func, Pipeline)
+pipeline = task(func)
+
+assert isinstance(pipeline, Pipeline)
 ```
 
 This creates a `Pipeline` object consisting of one 'task' (one step of data transformation).
 
-The `task` decorator can also be used more dynamically, which is preferable in most cases as this separates execution logic from the functional definitions themselves:
-
-```python
-from pyper import task
-
-def func(x: int):
-    return x + 1
-
-pipeline = task(func)
-```
-
 In addition to functions, anything `callable` in Python can be wrapped in `task` in the same way:
````
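For example, an instance of a class defining `__call__` can be wrapped just as a function is (`Multiply` is a made-up illustration, not taken from the docs):

```python
from pyper import task, Pipeline

class Multiply:
    def __init__(self, factor: int):
        self.factor = factor

    def __call__(self, x: int) -> int:
        return x * self.factor

# The callable instance becomes a single-task pipeline
pipeline = task(Multiply(2))
assert isinstance(pipeline, Pipeline)
```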