
Commit MetaInfo

Revision: 7fadd002c1bb9f841f5bf3fa467c17710c9ff9e7 (tree)
Time: 2022-06-28 03:38:43
Author: Albert Mietus < albert AT mietus DOT nl >
Committer: Albert Mietus < albert AT mietus DOT nl >

Log Message

ASIS. Split BusyCores into Use & Analyse -- keep blog-length limited; more text needs to be added

Change Summary

Diff

diff -r 0188fe6c45ab -r 7fadd002c1bb CCastle/1.Usage/7.BusyCores.rst
--- a/CCastle/1.Usage/7.BusyCores.rst Mon Jun 27 17:08:00 2022 +0200
+++ b/CCastle/1.Usage/7.BusyCores.rst Mon Jun 27 20:38:43 2022 +0200
@@ -60,45 +60,7 @@
6060 When the number of cores rises, this does not scale; more and more cores become idle. Now, your code has to use both
6161 concurrency_ and parallelisme_, but also handle Critical-Sections_, Semaphores_ (and friends) to synchronise tasks.
6262 |BR|
63-There is more below the horizon than just “Threads_”!
64-
65-Some concepts
66-=============
67-
68-Before we dive into the needs for Castle, let's define --briefly-- the available theoretical concepts. Routinely, we add
69-Wikipedia links for a deep-dive.
70-
71-.. include:: BusyCores-sidebar-concurrency.irst
72-
73-Concurrency
74------------
75-Concurrency_ is the ability to “compute” multiple things at the same time, instead of doing them one after the other. It requires another mindset, but isn’t that complicated.
76-A typical example is a loop: suppose we have a sequence of numbers and we'd like to compute the square of each one. Most developers will loop over those numbers, get one number, calculate the square, store it in another list, and continue with the next element. It works, but we have also instructed the computer to do it in sequence — especially when the task is a bit more complicated, the compiler doesn't know whether the ‘next task’ depends on the current one, and can't optimise it.
77-
78-A better plan is to tell the compiler about the tasks; most are independent: square a number. There is also one that has to run at the end: combine the results into a new list. And one is a bit funny: distribute the sequence-elements over the “square-tasks” — clearly, one has to start with this one, but it can be concurrent with many others too.
79-
80-
81-Parallelisme
82-------------
83-Parallelisme_ is about executing multiple tasks (apparently) at the same time. We will focus on running multiple
84-concurrent tasks (of the same program) on as many cores as possible. And when we assume we have a thousand cores, we need
85-(at least) a thousand independent tasks — at any moment — to gain maximal speed-up. This is not trivial!
86-|BR|
87-It’s not only about doing a thousand things at the same time (that is not too complicated, for a computer), but also — probably: mostly — about finishing a thousand times faster…
88-
89-With many cores, multiple program-steps can be executed at the same time: changing the same variable, accessing the
90-same memory, or competing for new memory. And when solving that, we introduce new hazards: like deadlocks_ and even
91-livelocks_.
92-
93-Locking
94-
95-
96-
97-Distributed
98------------
99-A special form of parallelism is Distributed-Computing_: computing on many computers. Many experts consider this
100-an independent field of expertise; still --as Multi-Core_ is basically “many computers on a chip”-- there is an
101-analogy [#DistributedDiff]_, and we should use the know-how that is available there to design our “best ever language”.
63+.. There is more below the horizon than just “Threads_”!
10264
10365
10466 Threading
@@ -121,25 +83,14 @@
12183
12284 .. rubric:: Footnotes
12385
124-.. [#DistributedDiff]
125- There a two (main) differences between Distributed-Computing_ and Multi-Core_. Firstly, all “CPUs” in
126- Distributed-Computing_ are active, independent and asynchronous. There is no option to share a “core” (as
127- commonly/occasionally done in Multi-process/Threaded programming); nor is there “shared memory” (one can only send
128- messages over a network).
129- |BR|
130-   Secondly, collaboration with (network-based) messages is a few orders of magnitude slower than (shared) memory communication. This
131-   makes it harder to speed up; the delay of messaging shouldn't be bigger than the acceleration gained by doing things in
132-   parallel.
133- |BR|
134-   But that condition applies to Multi-Core_ too, although the (timing) numbers do differ.
86+.. [#FN]
87+ Footnote
88+
13589
13690 .. _Multi-Core: https://en.wikipedia.org/wiki/Multi-core_processor
13791 .. _Concurrency: https://en.wikipedia.org/wiki/Concurrency_(computer_science)
13892 .. _Parallelisme: https://en.wikipedia.org/wiki/Parallel_computing
139-.. _Distributed-Computing: https://en.wikipedia.org/wiki/Distributed_computing
14093 .. _Critical-Sections: https://en.wikipedia.org/wiki/Critical_section
14194 .. _Semaphores: https://en.wikipedia.org/wiki/Semaphore_(programming)
14295 .. _Threads: https://en.wikipedia.org/wiki/Thread_(computing)
14396 .. _Heisenbugs: https://en.wikipedia.org/wiki/Heisenbug
144-.. _deadlocks: https://en.wikipedia.org/wiki/Deadlock
145-.. _livelocks: https://en.wikipedia.org/wiki/Deadlock#Livelock
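The parallelism section removed above (and re-added in the new Analyse chapter) notes that a thousand cores only pay off when a thousand independent tasks are ready at every moment. Amdahl's law makes that concrete; a minimal sketch, my illustration rather than part of the commit:

```python
# Amdahl's law: the serial fraction of a program caps the overall speedup,
# no matter how many cores run the parallel fraction.
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup when `parallel_fraction` of the work scales over `cores`."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# Even when 95% of the work is parallel, 1000 cores give less than 20x.
for p in (0.50, 0.95, 0.999):
    print(f"parallel fraction {p:.3f}: speedup {amdahl_speedup(p, 1000):.1f}x")
```

This is why the text stresses *independent* tasks: every dependency that serialises work grows the `1 - parallel_fraction` term.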
diff -r 0188fe6c45ab -r 7fadd002c1bb CCastle/1.Usage/BusyCores-sidebar-concurrency.irst
--- a/CCastle/1.Usage/BusyCores-sidebar-concurrency.irst Mon Jun 27 17:08:00 2022 +0200
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,35 +0,0 @@
1-.. -*- rst -*-
2- included in `6.BusyCores.rst`
3-
4-.. sidebar::
5-
6- .. tabs::
7-
8- .. tab:: Ordered
9-
10- Here, the programmer has (unwittingly) defined a sequential order.
11-
12- .. code-block:: python
13-
14- L2 = []
15- for n in L1:
16- L2.append(power(n))
17-
18- .. note:: As ``power()`` could have side-effects, the compiler **must** keep the defined order!
19-
20- .. tab:: Concurrent
21-
22- Now, without a specified order, the same functionality has become concurrent.
23-
24- .. code-block:: python
25-
26- L2 = [power(n) for n in L1]
27-
28- .. note::
29-
30- Although (current) python-compilers will run it sequentially, it is *allowed* to distribute it; even when
31- ``power()`` has side-effects!
32- |BR|
33-            As long as *python* puts the results in the correct order in list ``L2``, **any order** is allowed. “Out of
34-            order” side-effects are allowed by this code.
35-
diff -r 0188fe6c45ab -r 7fadd002c1bb CCastle/2.Analyse/8.BusyCores-concepts.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/CCastle/2.Analyse/8.BusyCores-concepts.rst Mon Jun 27 20:38:43 2022 +0200
@@ -0,0 +1,93 @@
1+.. include:: /std/localtoc.irst
2+
3+.. _MC-concepts:
4+
5+=======================
6+Concepts for Many Cores
7+=======================
8+
9+.. post::
10+ :category: Castle DesignStudy
11+ :tags: Castle, Concurrency
12+
13+   Effectively benefiting from thousands of cores, as I announced in :ref:`BusyCores`, is not easy. Many languages put it
14+   on the shoulders of the developer, usually by referring to pthreads_.
15+ |BR|
16+   But there is more below the horizon than just “Threads_”!
17+
18+   Let's discover some concepts that can help to design proper support, and unburden the typical (modern, embedded)
19+   developer.
20+
21+----------
22+
23+CUT & PASTE
24+
25+----------
26+
27+Some concepts
28+=============
29+
30+Before we dive into the needs for Castle, let's define --briefly-- the available theoretical concepts. Routinely, we add
31+Wikipedia links for a deep-dive.
32+
33+.. include:: BusyCores-sidebar-concurrency.irst
34+
35+Concurrency
36+-----------
37+Concurrency_ is the ability to “compute” multiple things at the same time, instead of doing them one after the other. It requires another mindset, but isn’t that complicated.
38+A typical example is a loop: suppose we have a sequence of numbers and we'd like to compute the square of each one. Most developers will loop over those numbers, get one number, calculate the square, store it in another list, and continue with the next element. It works, but we have also instructed the computer to do it in sequence — especially when the task is a bit more complicated, the compiler doesn't know whether the ‘next task’ depends on the current one, and can't optimise it.
39+
40+A better plan is to tell the compiler about the tasks; most are independent: square a number. There is also one that has to run at the end: combine the results into a new list. And one is a bit funny: distribute the sequence-elements over the “square-tasks” — clearly, one has to start with this one, but it can be concurrent with many others too.
41+
42+
43+Parallelisme
44+------------
45+Parallelisme_ is about executing multiple tasks (apparently) at the same time. We will focus on running multiple
46+concurrent tasks (of the same program) on as many cores as possible. And when we assume we have a thousand cores, we need
47+(at least) a thousand independent tasks — at any moment — to gain maximal speed-up. This is not trivial!
48+|BR|
49+It’s not only about doing a thousand things at the same time (that is not too complicated, for a computer), but also — probably: mostly — about finishing a thousand times faster…
50+
51+With many cores, multiple program-steps can be executed at the same time: changing the same variable, accessing the
52+same memory, or competing for new memory. And when solving that, we introduce new hazards: like deadlocks_ and even
53+livelocks_.
54+
55+Locking
56+
57+
58+
59+Distributed
60+-----------
61+A special form of parallelism is Distributed-Computing_: computing on many computers. Many experts consider this
62+an independent field of expertise; still --as Multi-Core_ is basically “many computers on a chip”-- there is an
63+analogy [#DistributedDiff]_, and we should use the know-how that is available there to design our “best ever language”.
64+
65+
66+--------
67+
68+END
69+
70+----------
71+
72+.. rubric:: Footnotes
73+
74+.. [#DistributedDiff]
75+   There are two (main) differences between Distributed-Computing_ and Multi-Core_. Firstly, all “CPUs” in
76+ Distributed-Computing_ are active, independent and asynchronous. There is no option to share a “core” (as
77+ commonly/occasionally done in Multi-process/Threaded programming); nor is there “shared memory” (one can only send
78+ messages over a network).
79+ |BR|
80+   Secondly, collaboration with (network-based) messages is a few orders of magnitude slower than (shared) memory communication. This
81+   makes it harder to speed up; the delay of messaging shouldn't be bigger than the acceleration gained by doing things in
82+   parallel.
83+ |BR|
84+   But that condition applies to Multi-Core_ too, although the (timing) numbers do differ.
85+
86+.. _pthreads: https://en.wikipedia.org/wiki/Pthreads
87+.. _Threads: https://en.wikipedia.org/wiki/Thread_(computing)
88+.. _Multi-Core: https://en.wikipedia.org/wiki/Multi-core_processor
89+
90+.. _deadlocks: https://en.wikipedia.org/wiki/Deadlock
91+.. _livelocks: https://en.wikipedia.org/wiki/Deadlock#Livelock
92+.. _Critical-Sections: https://en.wikipedia.org/wiki/Critical_section
93+.. _Distributed-Computing: https://en.wikipedia.org/wiki/Distributed_computing
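The new concepts file above lists deadlocks_ and livelocks_ as hazards and leaves a "Locking" stub. A minimal sketch of the classic hazard and its usual cure, a global lock order; this is my illustration (the account names are made up), not content from the commit:

```python
import threading

# Two threads taking two locks in opposite order (T1: A then B, T2: B then A)
# can block each other forever: a deadlock. The classic cure is a global
# lock order: every thread acquires the locks in the same, fixed order.

accounts = {"a": 100, "b": 100}
locks = {name: threading.Lock() for name in accounts}

def transfer(src: str, dst: str, amount: int) -> None:
    first, second = sorted((src, dst))   # fixed acquisition order, by name
    with locks[first], locks[second]:
        accounts[src] -= amount
        accounts[dst] += amount

t1 = threading.Thread(target=transfer, args=("a", "b", 10))
t2 = threading.Thread(target=transfer, args=("b", "a", 10))
t1.start(); t2.start(); t1.join(); t2.join()
print(accounts)  # the total is conserved, and no interleaving can deadlock
```

Without the `sorted()` line, the two threads could each hold one lock while waiting for the other's, which is exactly the deadlock the text warns about.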
diff -r 0188fe6c45ab -r 7fadd002c1bb CCastle/2.Analyse/BusyCores-sidebar-concurrency.irst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/CCastle/2.Analyse/BusyCores-sidebar-concurrency.irst Mon Jun 27 20:38:43 2022 +0200
@@ -0,0 +1,35 @@
1+.. -*- rst -*-
2+ included in `6.BusyCores.rst`
3+
4+.. sidebar::
5+
6+ .. tabs::
7+
8+ .. tab:: Ordered
9+
10+ Here, the programmer has (unwittingly) defined a sequential order.
11+
12+ .. code-block:: python
13+
14+ L2 = []
15+ for n in L1:
16+ L2.append(power(n))
17+
18+ .. note:: As ``power()`` could have side-effects, the compiler **must** keep the defined order!
19+
20+ .. tab:: Concurrent
21+
22+ Now, without a specified order, the same functionality has become concurrent.
23+
24+ .. code-block:: python
25+
26+ L2 = [power(n) for n in L1]
27+
28+ .. note::
29+
30+ Although (current) python-compilers will run it sequentially, it is *allowed* to distribute it; even when
31+ ``power()`` has side-effects!
32+ |BR|
33+            As long as *python* puts the results in the correct order in list ``L2``, **any order** is allowed. “Out of
34+            order” side-effects are allowed by this code.
35+
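The sidebar above argues that the comprehension form leaves the execution order open. In today's Python that freedom can already be cashed in explicitly with an executor; a sketch, assuming `power()` is the squaring function the sidebar hints at:

```python
from concurrent.futures import ThreadPoolExecutor

def power(n: int) -> int:        # stand-in for the sidebar's power()
    return n * n

L1 = [1, 2, 3, 4, 5]

# Ordered form: the loop nails down a sequential order.
L2 = [power(n) for n in L1]

# Concurrent form: Executor.map may run the calls on worker threads in any
# order, yet still yields the results in the order of L1.
with ThreadPoolExecutor(max_workers=4) as pool:
    L2_concurrent = list(pool.map(power, L1))

assert L2 == L2_concurrent == [1, 4, 9, 16, 25]
```

For CPU-bound work in CPython a `ProcessPoolExecutor` would sidestep the GIL; the calling code stays identical, which is the point the sidebar makes about order-free code.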
diff -r 0188fe6c45ab -r 7fadd002c1bb Makefile
--- a/Makefile Mon Jun 27 17:08:00 2022 +0200
+++ b/Makefile Mon Jun 27 20:38:43 2022 +0200
@@ -53,3 +53,6 @@
5353 wc:
5454 @echo "lines words file"
5555 @wc -lw `find CCastle/ -iname \*rst`|sort -r
56+
57+sidebar:
58+ @grep "include::" `find CCastle/ -type f -name \*.rst` /dev/null | grep sidebar| sort| sed 's/:../:\t\t ../'