Re: a dozen cpu's on a chip



On 13 May, 17:13, John Larkin
<jjlar...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
On Tue, 13 May 2008 08:05:12 -0700 (PDT), pantel...@xxxxxxxxx wrote:
On 8 May, 04:48, John Larkin
<jjlar...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
http://www.eetimes.com/news/latest/showArticle.jhtml;jsessionid=CESEX...

I bet we'll see 256 one of these days.

John

It is of course completely off-topic, but just to contribute for
test :-) to the noise, I am sure I have seen a 512-core chip several
years ago.

The problem is what to do with more than 6 cores.
As you all probably know, the Sony PS3 has a Cell processor
with one big and 6? small 'helper' processors.
Now a multimedia or (two-way) networking application, say signal
processing, decryption, decoding, and graphics, will maybe use 4 cores.
It is not easy to split a program over more than one core.
Even if a program is threaded, it does not always make sense:
I have written threaded programs where some threads use very few
resources, and running those on a separate core would make little sense.
Some multimedia software uses no threads at all (Linux mplayer IIRC),
while other software, the xine media player for example, _is_ threaded.
And this is from the POV of embedded.
Now sure, you could run some FPGA synthesis on one core,
PCB routing on another, SPICE on a third... but how often
do you use them all at the same time?
So, and I am not even thinking of Microsoft, who only have x86 binaries
of their OS, the software that takes full advantage of so many cores
for a _general purpose_ OS has, as far as I know, not been invented yet.
And are sequential cores always the best solution? Not sure;
in the above example the decryption could perhaps be done faster by an
FPGA (1 clock).

So, unless they come up with a software solution that makes full use
of those cores, perhaps the only other option is to try to up the
clock speed; new techniques to reduce power consumption are mentioned
here and there.
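
To put a rough number on why extra cores only help as far as the
software can actually be split up, here is a minimal sketch using
Amdahl's law (my choice of model, not something measured on any of the
chips mentioned). The parallel fractions are made-up illustrations.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of a program can run in parallel."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# The parallel fractions below are assumed values, not measurements.
for p in (0.5, 0.9):
    for n in (2, 6, 64, 256):
        s = amdahl_speedup(p, n)
        print(f"parallel fraction {p:.1f}, {n:3d} cores: speedup <= {s:.2f}")
```

With half of the work stuck in a serial part, the bound never reaches
2x, no matter how many cores are added.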

My XP, not doing much right now, claims to be running 31 processes.
Add in maybe another 30 device drivers, tcp/ip stacks, and file
managers, and it would keep a 64-core cpu mostly employed.

Yes, but would it run faster?
That is the issue.
Many of those processes use very few resources; that was the point I
was trying to make.
And if that is so, it is no use assigning them to their own cores.

Look at the list below from this Linux system (ps avx):
Most of the processes do nothing. There is an h264 encode running in
the background, plus a name server, mail server, ftp server, and http
server, and processor use is
Cpu(s): 2.7% us, 5.4% sy, 41.8% ni, 48.5% id, 0.7% wa, 0.3% hi, 0.7% s

The load average nicely balances around 1.0
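
For anyone who wants to read that Cpu(s) line quickly, a small sketch
that splits it into fields and works out how busy the CPU is. The line
is copied from above, including the cut-off last label; the regular
expression is my own assumption about the format.

```python
import re

# CPU summary line copied from the post (the last label is cut off in the original).
line = "Cpu(s): 2.7% us, 5.4% sy, 41.8% ni, 48.5% id, 0.7% wa, 0.3% hi, 0.7% s"

# Map each field label (us, sy, ni, id, ...) to its percentage.
fields = {name: float(value)
          for value, name in re.findall(r"([\d.]+)%\s*(\w+)", line)}

idle = fields["id"]
busy = 100.0 - idle
print(f"idle {idle:.1f}%, busy {busy:.1f}%, of which niced {fields['ni']:.1f}%")
```

Most of the busy half is niced work (the background encode), and the
CPU still sits idle almost half of the time.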

~ # ps avx
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1 ? Ss 0:06 17 29 1878 648 0.1 init [2]
2 ? SN 0:00 0 0 0 0 0.0 [ksoftirqd/0]
3 ? S 0:00 0 0 0 0 0.0 [watchdog/0]
4 ? S< 0:00 0 0 0 0 0.0 [events/0]
5 ? S< 0:00 0 0 0 0 0.0 [khelper]
6 ? S< 0:00 0 0 0 0 0.0 [kthread]
29 ? S< 0:00 0 0 0 0 0.0 [kblockd/0]
30 ? S< 0:00 0 0 0 0 0.0 [kacpid]
93 ? S< 0:00 0 0 0 0 0.0 [kseriod]
115 ? S< 0:44 0 0 0 0 0.0 [kswapd0]
116 ? S< 0:00 0 0 0 0 0.0 [aio/0]
117 ? S< 0:00 0 0 0 0 0.0 [jfsIO]
118 ? S< 0:00 0 0 0 0 0.0 [jfsCommit]
119 ? S< 0:00 0 0 0 0 0.0 [jfsSync]
120 ? S< 0:00 0 0 0 0 0.0 [xfslogd/0]
121 ? S< 0:00 0 0 0 0 0.0 [xfsdatad/0]
288 ? S< 0:00 0 0 0 0 0.0 [kpsmoused]
292 ? S< 0:00 0 0 0 0 0.0 [reiserfs/0]
367 ? S<s 0:01 0 52 2043 844 0.2 udevd --daemon
662 pts/7 S+ 0:01 1 44 2183 1116 0.2 top
775 ? S< 0:00 0 0 0 0 0.0 [kgameportd]
786 ? S< 0:00 0 0 0 0 0.0 [ksuspend_usbd]
787 ? S< 0:00 0 0 0 0 0.0 [khubd]
812 pts/1 SN 0:45 0 467 4408 1404 0.3 sh /usr/local/sbin/temp_to_lcd
2022 ? S 0:00 0 11 1548 396 0.1 sleep 60
2087 ? Ss 0:00 0 284 4547 1148 0.2 /usr/sbin/sshd
2167 pts/1 S 0:00 0 11 3604 464 0.1 sleep 55
2269 pts/1 SN 0:00 0 11 3604 468 0.1 sleep 10
2274 ? S 0:00 0 11 1548 396 0.1 sleep 1
2275 pts/6 R+ 0:00 0 63 3992 756 0.1 ps avx
2540 ? Ss 0:00 0 16 1563 592 0.1 /usr/sbin/acpid -c /etc/acpi/events -s /var/run/acpid.socket
2822 ? SNs 0:00 1 86 1973 760 0.1 /sbin/syslog-ng -p /var/run/syslog-ng.pid
2830 ? Ss 0:00 0 25 2126 732 0.1 /usr/sbin/cron
2838 tty1 Ss 0:00 1 27 2588 1304 0.3 /bin/login --
2839 tty2 Ss+ 0:00 0 10 1561 488 0.1 /sbin/getty 38400 tty2
2840 tty3 Ss+ 0:00 0 10 1565 488 0.1 /sbin/getty 38400 tty3
2841 tty4 Ss+ 0:00 0 10 1565 492 0.1 /sbin/getty 38400 tty4
2842 tty5 Ss+ 0:00 0 10 1561 488 0.1 /sbin/getty 38400 tty5
2843 tty6 Ss+ 0:00 0 10 1561 484 0.1 /sbin/getty 38400 tty6
2844 ? SNs 2:11 0 61 1562 440 0.1 /usr/sbin/gpm -m /dev/psaux -t autops2 -r 30 -Rms3
2914 ? Ss 0:00 6 467 10188 7056 1.8 /usr/share/apache2/bin/httpd -k start
2917 ? S 0:00 2 467 10348 7372 1.9 /usr/share/apache2/bin/httpd -k start
2918 ? S 0:00 0 467 10348 7376 1.9 /usr/share/apache2/bin/httpd -k start
2919 ? S 0:00 0 467 10348 7456 1.9 /usr/share/apache2/bin/httpd -k start
2920 ? S 0:00 2 467 10348 7472 1.9 /usr/share/apache2/bin/httpd -k start
2921 ? S 0:00 5 467 10348 7484 1.9 /usr/share/apache2/bin/httpd -k start
2922 ? Ss 0:00 0 298 3253 1328 0.3 sendmail: accepting connections on port 25
2927 ? Ss 0:00 0 16 2511 832 0.2 inetd
2936 ? S 0:00 26 2665 15182 2804 0.7 /usr/local/pgsql/bin/postmaster -i -D /usr/local/pgsql/data
2937 ? S 0:06 0 467 2320 1292 0.3 sh /usr/local/sbin/get-temps
2943 ? S 9:31 0 467 2580 1468 0.3 sh /usr/local/sbin/test-sensors
2945 ? Ss 0:00 1 636 2903 948 0.2 proftpd: (accepting connections)
2972 ? Ss 0:00 250 236 5435 1980 0.5 cupsd
2973 ? S 0:00 0 2665 15182 932 0.2 postgres: writer process
2974 ? S 0:00 0 2665 5970 780 0.2 postgres: stats buffer process
2975 ? S 0:00 0 2665 5398 1148 0.2 postgres: stats collector process
3012 ? S 0:00 0 5 1554 360 0.0 /usr/local/bin/test-nokia-email -r -v 2
3061 tty1 S 0:00 9 467 5836 2668 0.6 -zsh
3122 tty1 S+ 0:00 0 467 4676 1720 0.4 /bin/sh /usr/X11R6/bin/startx
3132 tty1 S+ 0:00 4 7 2332 660 0.1 xinit /root/.xinitrc -- -layout wide_screen
3133 ? S< 269:40 25 1416 99475 49876 12.9 X :0 -layout wide_screen
3159 tty1 S 0:00 0 467 4676 1628 0.4 /bin/sh /root/.xinitrc
3161 tty1 S 0:00 16 109 4302 2712 0.7 xfm -appmgr -geometry 1430x872+0+0
3162 tty1 S 0:00 12 975 7760 2060 0.5 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+0+900
3163 tty1 S 0:00 10 975 7860 2392 0.6 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+0+1800
3164 tty1 S 0:03 15 975 19524 14212 3.6 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+1440+0
3165 tty1 S 0:00 11 975 7760 2152 0.5 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+1440+900 -rv
3166 tty1 S 0:01 20 975 9048 3632 0.9 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+1440+1800
3167 tty1 S 0:00 20 975 19520 13580 3.5 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+2880+0
3168 tty1 S 0:00 9 975 7756 2124 0.5 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+2880+900
3169 tty1 R 0:02 18 975 16236 10752 2.7 /usr/bin/X11/rxvt -fn 7x14 -ls -sl 10000 -geometry 202x62+2880+1800
3171 tty1 S 0:01 2 48 2631 1212 0.3 xmixer -geometry 507x230+2225+1513 -device /dev/mixer -poll -all
3172 tty1 S 0:01 2 48 2631 1212 0.3 xmixer -geometry 597x230+1524+1506 -device /dev/mixer1 -poll -all
3173 tty1 S 0:36 5 101 2666 1412 0.3 fvwm
3180 pts/0 Ss+ 0:00 0 467 5680 2644 0.6 -zsh
3181 pts/1 Ss+ 0:00 2 467 5832 2760 0.7 -zsh
3182 pts/2 Ss+ 0:00 0 467 5832 2732 0.7 -zsh
3183 pts/3 Ss 0:00 0 467 5680 2616 0.6 -zsh
3184 pts/4 Ss+ 0:00 0 467 5676 2580 0.6 -zsh
3185 pts/5 Ss+ 0:00 0 467 5832 2756 0.7 -zsh
3186 pts/6 Ss 0:00 0 467 5832 2756 0.7 -zsh
3187 pts/7 Ss 0:00 0 467 5832 2756 0.7 -zsh
3290 ? S 0:34 1 23 2328 652 0.1 xbindkeys
3302 pts/1 S 0:00 0 467 4672 1636 0.4 sh /usr/local/sbin/test-update-mcamip
3531 pts/3 S+ 0:00 25 86 5181 2152 0.5 telnet 10.0.0.152
3956 tty1 S 0:00 0 975 9620 4152 1.0 /usr/bin/X11/rxvt -font 7x14 -ls -sl 10000
3957 pts/8 Ss 0:00 0 467 5836 2704 0.7 -zsh
4005 pts/8 S+ 0:01 1 5 1838 528 0.1 ptlrc
4923 ? Sl 0:27 7 396 21459 8056 2.0 NewsFleX
6055 ? S 0:00 0 467 10348 7464 1.9 /usr/share/apache2/bin/httpd -k start
6100 ? S 0:00 0 467 10348 7456 1.9 /usr/share/apache2/bin/httpd -k start
8057 ? Ss 0:02 11 1858 15149 12948 3.3 named -d 2 -c /etc/named.conf
11651 tty1 S 0:03 1 7 5792 1980 0.5 xhcs -shared
11654 tty1 S 0:28 0 10 1765 680 0.1 hcs
11691 tty1 Sl 1:32 5 47 16197 3440 0.8 xmpl
12418 ? S 0:00 1 467 10348 7480 1.9 /usr/share/apache2/bin/httpd -k start
12451 ? S 0:00 0 467 10348 7448 1.9 /usr/share/apache2/bin/httpd -k start
18064 tty1 S 0:00 0 467 4412 1400 0.3 /bin/sh /root/compile/firefox/firefox/firefox
18072 tty1 S 0:00 0 467 4408 1396 0.3 /bin/sh /root/compile/firefox/firefox/run-mozilla.sh /root/compile/firefox/firefox/firefox-bin
18084 tty1 Sl 12:01 796 10088 201263 115092 29.8 /root/compile/firefox/firefox/firefox-bin
19510 pts/1 RN 146:12 8 27 13748 9308 2.4 /usr/local/bin/mcamip -o -x -f 2 -t -a 10.0.0.151 -p 80 -u USER -w PASSWORD -y
19511 pts/1 SN 132:57 3 4731 20580 13000 3.3 /usr/local/bin/ffmpeg -f yuv4mpegpipe -i - -f avi -vcodec h264 -b 160 -g 100 -bf 2 -y /mnt/hdd1/camera4-1721.avi
23165 ? S 0:00 1 467 10348 7488 1.9 /usr/share/apache2/bin/httpd -k start
28753 ? S 0:00 0 0 0 0 0.0 [pdflush]
29101 ? S 0:18 0 0 0 0 0.0 [pdflush]
31173 ? S 0:00 16 975 7540 2420 0.6 rxvt -geometry 80x28 -sl 500 -fn 10x20 -e joe /root/.NewsFleX/postings/current/temp
31174 pts/9 Ss+ 0:00 6 217 4502 1388 0.3 joe /root/.NewsFleX/postings/current/temp


So in this case no extra core is needed; it would make no difference!
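
To show how lopsided the listing is, here is a rough sketch that adds
up the TIME column for a handful of the processes above and looks at
how much of it the top three account for. Only a sample of rows is
used, and TIME is CPU time accumulated since each process started, not
the current load, so take it as an illustration only.

```python
# A handful of (command, TIME) pairs copied from the listing above;
# TIME is accumulated CPU time in minutes:seconds.
sample = [
    ("X", "269:40"),
    ("mcamip", "146:12"),
    ("ffmpeg", "132:57"),
    ("firefox-bin", "12:01"),
    ("test-sensors", "9:31"),
    ("gpm", "2:11"),
    ("named", "0:02"),
    ("httpd", "0:00"),
    ("sendmail", "0:00"),
]

def to_seconds(t):
    """Convert a 'minutes:seconds' string from ps into seconds."""
    minutes, seconds = t.split(":")
    return int(minutes) * 60 + int(seconds)

total = sum(to_seconds(t) for _, t in sample)
top3 = sorted(sample, key=lambda row: to_seconds(row[1]), reverse=True)[:3]
top3_share = sum(to_seconds(t) for _, t in top3) / total

print("top three:", [name for name, _ in top3])
print(f"their share of the sampled CPU time: {top3_share:.0%}")
```

In this sample, X, mcamip, and ffmpeg account for nearly all of the
CPU time; the servers and shells barely register.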


Again, it's not about speed, although that would be improved too.

So how about a 10 GHz or 20 GHz clock, would that not make more sense?

The speed thing is hitting the wall, which is why everybody is going
multicore.

It was hitting the heat barrier at around 4 GHz several years ago.
Now we have 3 W 1 GHz processors, it seems; they have made progress in
reducing power consumption... so 'the wall' may have moved already?

